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Abstract 

Providing the neurobiological basis of information processing in higher animals, spiking neural net¬ 
works must be able to learn a variety of complicated computations, including the generation of ap¬ 
propriate, possibly delayed reactions to inputs and the self-sustained generation of complex activity 
patterns, e.g. for locomotion. Many such computations require previous building of intrinsic world 
models. Here we show how spiking neural networks may solve these different tasks. Firstly, we 
derive constraints under which classes of spiking neural networks lend themselves to substrates of 
powerful general purpose computing. The networks contain dendritic or synaptic nonlinearities and 
have a constrained connectivity. We then combine such networks with learning rules for outputs or 
recurrent connections. We show that this allows to learn even difficult benchmark tasks such as the 
self-sustained generation of desired low-dimensional chaotic dynamics or memory-dependent com¬ 
putations. Furthermore, we show how spiking networks can build models of external world systems 
and use the acquired knowledge to control them. 


Author Summary 

Animals and humans can learn versatile computations such as the generation of complicated activity 
patterns to steer movements or the generation of appropriate outputs in response to inputs. Such 
learning must be accomplished by networks of nerve cells in the brain, which communicate with short 
electrical impulses, so-called spikes. Here we show how such networks may perform the learning. 
We track their ability back to experimentally found nonlinearities in the couplings between nerve cells 
and to a network connectivity that complies with constraints. We show that the spiking networks are 
able to learn difficult tasks such as the generation of desired chaotic activity and the prediction of 
the impact of actions on the environment. The latter allows to compute optimal actions by mental 
exploration. 
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Introduction 


The understanding of neural network dynamics on the mesoscopic level of hundreds and thousands 
of neurons and their ability to learn highly complicated computations is a fundamental open challenge 
in neuroscience. For biological systems, such an understanding will allow to connect the microscopic 
level of single neurons and the macroscopic level of cognition and behavior. In artificial computing, it 
may allow to propose new, possibly more efficient computing schemes. 

Randomly connected mesoscopic networks can be a suitable substrate for computations |Il|4l|3l|5l 
[5], as they reflect the input in a complicated, nonlinear way and at the same time maintain, like a com¬ 
putational “reservoir”, fading memory of past inputs as well as of transformations and combinations of 
them. This includes the results of computations on current and past inputs. Simple readout neurons 
may then learn to extract the desired result; the computations are executed in real time, i.e. with¬ 
out the need to wait for convergence to an attractor (“reservoir computing”) (Hill. Non-random and 
adaptive network connectivity can change performance |[6llZl[8l. 

Networks with higher computational power, in particular with the additional ability to learn self- 
sustained patterns of activity and persistent memory, require an output feedback or equivalent learn¬ 
ing of their recurrent connections @1 (S). However, network modeling approaches achieving such 
universal (i.e. general purpose) computational capabilities so far concentrated on networks of con¬ 
tinuous rate units (HE], which do not take into account the characteristics that neurons in biological 
neural networks communicate via spikes. Indeed, the dynamics of spiking neural networks are dis¬ 
continuous, usually highly chaotic, variable, and noisy. Readouts of such spiking networks show low 
signal-to-noise ratios. This hinders computations following the described principle in particular in 
presence of feedback or equivalent plastic recurrent connections, and has questioned it as model for 
computations in biological neural systems l^ fT^fTTI . 

Here we first introduce a class of recurrent spiking neural networks that are suited as a substrate 
to learn universal computations. They are based on standard, established neuron models, take into 
account synaptic or dendritic nonlinearities and are required to respect some structural constraints 
regarding the connectivity of the network. To derive them we employ a precise spike coding scheme 
similar to ref. [3], which was introduced to approximate linear continuous dynamics. 

Thereafter we endow the introduced spiking networks with learning rules for either the output or the 
recurrent connection weights and show that this enables them to learn equally complicated, mem¬ 
ory dependent computations as non-spiking continuous rate networks. The spiking networks we are 
using have only medium sizes, between tens and a few thousands of neurons, like networks of rate 
neurons employed for similar tasks. We demonstrate the capabilities of our networks by applying 
them to challenging learning problems which are of importance in biological contexts. In particular, 
we show how spiking neural networks can learn the self-sustained generation of complicated dy¬ 
namical patterns, and how they can build world models, which allow to compute optimal actions to 
appropriately influence an environment. 
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Results 


Continuous signal coding spiking neural networks (CSNs) 

Network architecture 

For our study, we use leaky integrate-and-fire neurons. These incorporate crucial features of biologi¬ 
cal neurons, such as operation in continuous time, spike generation and reset, while also maintaining 
some degree of analytical tractability. A network consists of N neurons. The state of a neuron n is 
given by its membrane potential Vn{t). The membrane potential performs a leaky integration of the 
input and a spike is generated when t 4 (t) reaches a threshold, resulting in a spiketrain 

Sn{t) = - tn) (1) 

i’U 

with spike times tn and the Dirac delta-distribution S. After a spike, the neuron is reset to the reset 
potential, which lies 6 below the threshold. The spike train generates a train of exponentially decaying 
normalized synaptic currents 

^ - tn) rn{t) = -Xsrn{t) + Sn{t), (2) 

where Ts = is the time constant of the synaptic decay and 0(.) is the Heaviside theta-function. 

Throughout the article we consider two closely related types of neurons, neurons with saturating 
synapses and neurons with nonlinear dendrites (cf. Fig. 1). In the model with saturating synapses 
(Fig. la), the membrane potential Vn{t) of neuron n obeys 

N 

Vn{t) = — XvVn{t) + E ^nm tanh {'^rm{t)) + V)A^r„(i) 
m=l 

— 6Sn{t) -|- /e,n(i), (3) 

with membrane time constant 

The saturation of synapses, e.g. due to receptor saturation or finite reversal potentials, acts as a 
nonlinear transfer function (l3l[T4], which we model as a tanh-nonlinearity (since r^it) > 0 only the 
positive part of the tanh becomes effective). We note that this may also be interpreted as a simple 
implementation of synaptic depression: A spike generated by neuron m at tm leads to an increase of 
Tmitm) by 1. As long as the synapse connecting neuron m to neuron n is far from saturation (linear 
part of the tanh-function) this leads to the consumption of a fraction 7 of the synaptic “resources” and 
the effect of the spike on the neuron is approximately the effect of a current - 

tm)- When a larger number of such spikes arrive in short time such that the consumed resources 
accumulate to 1 and beyond, the synapse saturates at its maximum strength Anm and the effect of 
individual inputs is much smaller than before. The recovery from depression is here comparably fast, 
it takes place on a timescale of A^^ (compare, e.g., [T5J). 

The reset of the neuron is incorporated by the term -9sn{t). The voltage lost due to this reset 
is partially recovered by a slow recovery current (afterdepolarization) VrXsrn{t)\ its temporally inte¬ 
grated size is given by the parameter V). This is a feature of many neurons e.g. in the neocortex. 
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Figure 1: Coding of continuous signals in neurons with saturating synapses (a,b) and nonlinear dendrites (c,d). 

(a,b): A neuron with saturating synapses (a) that directty codes for a continuous signai (b). Panei (a) dispiays the neuron 
with an axon (red) and dendrites (dark btue) that receive inputs from the axons of other neurons (axons at the bottom) 
via saturating synapses (symboiized by sigmoids at the synaptic contacts). The currents entering the soma are weighted 
sums of input spike trains that are synapticalty fiitered (generating scaied normaiized synaptic currents -yrn(t), synaptic 
time scaie rj and thereafter subject to a saturating synaptic noniinearity. Externai inputs (axons at the top) are received 
without saturation. The continuous signai x{t) (panei b ieft hand side, green) is the sum of the neuron’s membrane potentiai 
V(t) (red) and its seated normaiized synaptic current 9r(t) (dark btue). r(t) is a tow-pass tittered version of the neuron’s 
spike train s{t) (tight blue in red box). Ifx{t) > 0, the time scale ofx{t) should be large against the synaptic time scale 
and x{t) should predominantly be large against the neuron’s threshold, s /2 (panel b right hand side, assumptions [1,2] in 
the main text). x{t) is then already well approximated by 9r(t), while V (t) is oscillating between Ifx{t) < 0, we have 
V(t) < 0, no spikes are generated and r{t) quickly decays to zero, such that we predominantly have r{t) « 0 and x{t) is 
well approximated by V{t) (cf Equation 

(c,d): Two neurons with nonlinear dendrites (c) from a larger network that distributedly codes for a continuous signal (d). (c): 
Each neuron has an axon (red) and different types of dendrites (cyan, light blue and dark blue) that receive inputs from the 
axons of other neurons (axons at the bottom) via fast or slow conventional synapses (highlighted by circles and squares). 
Linear dendrites with slow synapses (cyan with circle contacts) generate somatic currents that are weighted linear sums 
of low-pass filtered presynaptic spike trains (weighted sums of the r„(t)). Linear dendrites with fast synapses (light blue 
with square contacts) generate somatic currents with negligible filtering (weighted sums of the spike trains s„{t)). Spikes 
arriving at a nonlinear dendrite (dark blue) are also filtered (circular contact). The resulting rn{t) are weighted, summed up 
linearly in the dendrite and subjected to a saturating dendritic nonlinearity (symbolized by sigmoids at dendrites), before 
entering the soma. VJe assume that the neurons have nonlinear dendrites that are located in similar tissue areas, such 
that they connect to the same sets of axons and receive similar inputs, (d): All neurons in the network together encode 
J continuous signals x(t) (one displayed in green) by a weighted sum of their membrane potentials V{t) (two traces of 
different neurons displayed in red) and their normalized PSCs r(t) (two traces displayed in cyan). The Tv{t) alone already 
approximate x(i) well. The neurons’ output spike trains s{t) (light blue in red box) generate slow and fast inputs to other 
neurons. (Note that spikes can be generated due to suprathreshold excitation by fast inputs. Since we plot V (t) after fast 
inputs and possible resets, the corresponding threshold crossings do not appear.) 
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in the hippocampus and in the cerebellum [T6], and may be caused by different types of somatic or 
dendritic currents, such as persistent and resurgent sodium and calcium currents, or by excitatory 
autapses p71|T8]. It provides a simple mechanism to sustain (fast) spiking and generate bursts, e.g. 
in response to pulses. Ie,n{t) is an external input, its constant part may be interpreted as sampling 
slow inputs specifying the resting potential that the neuron asymptotically assumes for long times 
without any recurrent network input. We assume that the resting potential is halfway between the 
reset potential V)es and the threshold t4es + 6*. We set it to zero such that the neuron spikes when 
the membrane potential reaches s/2 and resets to -s/2. To test the robustness of the dynamics we 
sometimes add a white noise input rin{t) satisfying {rin{t)rim{t')) = -1') with the Kronecker 

delta 6nm- 

For simplicity, we take the parameters Ay, 9, V) and 7, As identical for all neurons and synapses, 
respectively. We take the membrane potential t4 and the parameters V) and 9 dimensionless, they 
can be fit to the voltage scale of biological neurons by rescaling with an additive and a multiplicative 
dimensionful constant. Time is measured in seconds. 

We find that networks of the form Equation ^ generate dynamics suitable for universal computation 
similar to continuous rate networks |4l [5], if 0 < A^, < As, where A^, = As (l - ^), Anm sufficiently 
large and 7 small. The conditions result from requiring the network to approximate a nonlinear con¬ 
tinuous dynamical system (see next section). 

An alternative interpretation of the introduced nonlinearity is that the neurons have nonlinear den¬ 
drites, where each nonlinear compartment is small such that it receives at most one (conventional, 
nonsaturating) synapse. Anm is then the strength of the coupling from a dendritic compartment to the 
soma. This interpretation suggests an extension of the neuron model allowing for several dendrites 
per neuron, where the inputs are linearly summed up and then subjected to a saturating dendritic 
nonlinearity [HI [2011^. Like the previous model, we find that such a model has to satisfy additional 
constraints to be suitable for universal computation: 

Neurons with nonlinear dendrites need additional slow and fast synaptic contacts which arrive near 
the soma and are summed linearly there (Fig. 1c). Such structuring has been found in biological 
neural networks [22]. We gather the different components into a dynamical equation for 14 as 



N 


N 





(4) 


Dnj is the coupling from the jth dendrite of neuron n to its soma. The total number of dendrites and 
neurons is referred to as J and N respectively. Wnjm is the coupling strength from neuron m to the 
jth nonlinear dendrite of neuron n. The slow, significantly temporally filtered inputs from neuron m 
to the soma of neuron n, Unmrm{t), have connection strengths Unm- The fast ones, UnmSm{t), have 
negligible synaptic filtering (i.e. negligible synaptic rise and decay times) as well as negligible conduc¬ 
tion delays. The resets and recoveries are incorporated as diagonal elements of the matrices Unm 
and Unm- To test the robustness of the dynamics, also here we sometimes add a white noise input 
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explanation 

optimal value 

Dnj 

coupling from the jth dendrite 
of neuron nto its soma 

Dnj = X]i=l ^inAij 

^^njm 

coupling strength from neuron m to 
the jth nonlinear dendrite of neuron n 

^ jm 

Unm 

slow coupling from neuron m to neuron n; 
diagonal elements incorporate 
a recovery current 

Unm — ^^^j=l^j'DX'jm l^^s^nm 

Qj — \g 

Unm 

fast coupling from neuron m to neuron re; 
diagonal elements incorporate the reset 

Unm — ^^j=i ^jn^jm f^^nm 

gn 

threshold of neuron re 

f)Tl _ Unn 

^ ~ 2 


Table 1: Parameters of a network of neurons with nonlinear dendrites (of Equation and their optimal values. 

r}n{t)- To increase the richness of the recurrent dynamics and the computational power of the network 
(cf. [23J for disconnected units without output feedback) we added inhomogeneity, e.g. through the 
external input current in some tasks. In the control/mental exploration task, we added a constant bias 
term bj as argument of the tanh to introduce inhomogeneity. 

We find that the network couplings D, W, U and U @ (we use bold letters for vectors and matrices) 
should satisfy certain interrelations. As motivated in the subsequent section and derived in the sup¬ 
porting material, their components may be expressed in terms of the components of a J x matrix 
r, and a J X J matrix A as Dnj = T,i=i^inAij, Wnjm = Tjm, Unm — T 

Unm = I2j=i ^jn^^jm + p^nm, Where a = Xg - Xx and p>0\s small (see also Table 1 for an overview). 
The thresholds are chosen identical, 0” = 6, see Methods. 

Again, the conditions result from requiring the network to approximate a nonlinear continuous dy¬ 
namical system. This system. Equation ([jl), is characterized by the J x J coupling matrix A and 
a J-dimensional input c{t) whose components are identical to the J independent components of 
the external input current Ig in equation Q; the matrix r is a decoding matrix that fixes the rela¬ 
tion between spiking and continuous dynamics (see next section). We note that the matrices r and 
A are largely unconstrained, such that the coupling strengths maintain a large degree of arbitrari¬ 
ness. Ideally, Wnjm is independent of n, therefore neurons have dendrites that are similar in their 
input characteristics to dendrites in some other neurons (note that D may have zero entries, so den¬ 
drites can be absent). We interpret these as dendrites that are located in a similar tissue area and 
therefore connect to the same axons and receive similar inputs (cf. Fig. 1c for an illustration). The 
interrelations between the coupling matrices might be realized by spike-timing dependent synaptic or 
structural plasticity. Indeed, for a simpler model and task, appropriate biologically plausible learning 
rules have been recently highlighted |2l [25]. We tested robustness of our schemes against struc¬ 
tural perturbations (see Figs. C, D in SI Supporting Information), in particular for deviations from the 
n-independence of Wnjm (Fig. C in SI Supporting Information). 

The networks Equation ^ with saturating synapses have a largely unconstrained topology, in partic¬ 
ular they can satisfy the rule that neurons usually act only excitatorily or inhibitorily. For the networks 
Equation (Q with nonlinear dendrites, it is less obvious how to reconcile the rule with the constraints 
on the network connectivity. Solutions for this have been suggested in simpler systems and are 
subject to current research [3]. 

The key property of the introduced neural architecture is that the spike trains generated by the neu¬ 
rons encode with high signal-to-noise ratio a continuous signal that can be understood in terms of 
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ordinary differential equations. In the following section we show how this signal is decoded from the 
spike trains. Thereafter, we may conclude that the spiking dynamics are sufficiently “tamed” such 
that standard learning rules can be applied to learn complicated computations. 

Direct encoding of continuous dynamics 

The dynamics of a neural network with N integrate-and-fire neurons consist of two components, 
the sub-threshold dynamics V(t) = of the membrane potentials and the spike 

trains s{t) = (Equation [i}, which are temporal sequences of ^-distributions. In the 

model with saturating synapses, all synaptic interactions are assumed to be significantly temporally 
filtered, such that the Vn{t) are continuous except at reset times after spiking (Equation ^). We 
posit that the V(t) and the s(t) should together form some iV-dimensional continuous dynamics 
x(t) = ...,XNit))'^. The simplest approach is to setup x(i) as a linear combination of the 

two components V(t) and s(t). To avoid infinities in x„(t), we need to eliminate the occurring <5- 
distributions by employing a smoothed version of Sn{t). This should have a finite discontinuity at 
spike times such that the discontinuity in Vn{t) can be balanced. A straightforward choice is to use 
Ovnit) (Equation 0) and to set 

Vn{t) +ern{t) = Xnit). (5) 

(cf. Fig. 1b). When the abovementioned conditions on A^, As, A and 7 are satisfied (cf. end of the 
section introducing networks with saturating synapses), the continuous signal x(t) follows a system 
of first order nonlinear ordinary differential equations similar to those describing standard non-spiking 
continuous rate networks used for computations (cf. and Equation ([jl) below), 

N 

[®n(t')]_ [3^n(t)]_(_ “I" ^ ^ tanh. [Xm(^)]_|_^ -|- (6) 

m=l 

with the rectifications [xn{t)]^ = max(xn(i),0), [xn{t)]_ = min(x„(t),0). We call spiking networks 
where this is the case continuous signal coding spiking neural networks (CSNs). 

Except for the rectifications. Equation ^ has a standard form for non-spiking continuous rate net¬ 
works, used for computations HI [5] i6|. A salient choice for A^; is A^; = Ay, i.e. 14= (l- 6, 

such that the rectifications outside the tanh-nonlinearity vanish. Equation ^ generates dynamics 
that are different from the standard ones in the respect that the trajectories of individual neurons are, 
e.g. for random Gaussian matrices A, not centered at zero. However, they can satisfy the conditions 
for universal computation (enslaveability/echo state property and high dimensional nonlinear dynam¬ 
ics) and generate longer-term fading memory for appropriate scaling of A. Also the corresponding 
spiking networks are then suitable for fading memory-dependent computations. Like for the standard 
networks |lZl[28], we can derive sufficient conditions to guarantee that the dynamics Equation ^ are 
enslaveable by external signals (echo state property). ||A|| < min(Ay, A^), where ||A|| is the largest 
singular value of the matrix A, provides such a condition (see Supplementary material for the proof). 
The condition is rather strict, our applications indicate that the CSNs are also suited as computational 
reservoirs when it is violated. This is similar to the situation in standard rate network models [7]. We 
note that if the system is enslaved by an external signal, the time scale of Xn{t) is largely determined 
by this signal and not anymore by the intrinsic scales of the dynamical system. 

We will now show that spiking neural networks Equation ^ can encode continuous dynamics Equa- 
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tion For this we derive the dynamicai equation of the membrane potentiai ^ from the dynamics 
of x(t) using the coding ruie Equation the dynamicai equation 0 for r„(i) and the ruie that a 
spike is generated whenever Vnit) reaches threshoid '?/ 2 : We first differentiate Equation ^ to eiimi- 
nate Xn{t) from Equation ^ and empioy Equation 0 to eiiminate rn{t). The resuiting expression for 
Vn{t) reads 

N 

Vn{t) = -Xv [Xnit)]_ - Xx [Xn(t)]+ + ^ tanh Q [Xm{t)]^'^ - 0Sn{t) + Xs9rn{t) + (7) 

m=l 

it aiready incorporates the resets of size 6 (cf. the term -Osnit)), they arise since Xn{t) = Vn{t)+6rn{t) 
is continuous and r„(t) increases by one at spike times (thus V must decrease by 6). We now 
eiiminate the occurrences of [xn{t)]_^ and [xn{t)]_. 

For this, we make two assumptions (cf. Fig. 1b) on the Xnit) if they are positive: 

[1] The dynamics of Xn{t) are siow against the synaptic timescaie r*, 

[2] the Xn{t) assume predominantiy vaiues Xnit) > ^/ 2 . 

First we consider the case x„(t) > 0. Since Vn{t) is reset when it reaches its threshoid vaiue 9 / 2 , t4(t) 
is aiways smaiier than s/ 2 . Thus, given 14 (t) > 0 assumption [2] impiies that we can approximate 
Xnit) ^ Ornit), as the contribution of Vnit) is negiigibie because Vnit) < s/ 2 . This stiii hoids if Vnit) 
is negative and its absoiute vaiue is not iarge against s/2. Furthermore, assumption [1] impiies that 
smaiier negative Vnit) cannot co-occur with positive x„(t): r„(t) is positive and in the absence of 
spikes it decays to zero on the synaptic time scaie (Equation 0). When Vnit) < 0, neuron n is not 
spiking anymore. Thus when Vnit) is shrinking towards smaii negative vaiues and r„(i) is decaying 
on a timescaie of r^, Xnit) is aiso decaying on a time-scaie r^. This contradicts assumption [1]. Thus 
when Xnit) > 0, the absoiute magnitude of Vnit) is on the order of s/2. With assumption [2] we can 
thus set Xnit) ^ Ornit), whenever x„(t) > 0, negiecting contributions of size s/ 2 . 

Now we consider Xnit) < 0. This impiies Vnit) < 0 (since aiways r„(i) > 0) as weii as a quick 
decay of r„(i) to zero. When x„(i) assumes vaiues significantiy beiow zero, assumption [1] impiies 
that we have Xnit) k. Vnit) and r„(i) rs 0 , otherwise x„(t) must have changed from iarger positive 
(assumption [2]) to iarger negative vaiues on a timescaie of r^. 

The approximate expressions may be gathered in the repiacements [x„(t)]_^ = Ovnit) and [x„(t)]_ = 
\Vnit)]_. Using these in Equation 0 yieids together with = As (l - 

N 

Vnit) = —Ay [V)i(t)]_ -|- E ^nm tanh ijVmit)) + VfXsVnit) - OSnit) + Ie,nit). (8) 

m=l 

Note that our repiacements aiiowed to eiiminate the bioiogicaiiy impiausibie V-dependencies in the 
interaction term. 

To simpiify the remaining t4(t)-dependence, we additionaiiy assume that 
[2’] Xnit) assumes predominantiy vaiues Xnit) > XvO/i2Xx), 

if Xnit) is positive. This can be stricter than [2] depending on the vaiues of A^; and Ay. For positive 
Xnit), where Ay [xnit)]_ in Equation 0 is zero, XyVnit) has an absoiute magnitude on the order 
of XvO/2 (see the arguments above). [2’] impiies that this is negiigibie against -A^; [xn(t)]+. For 
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negative Xn{t), we still have Xn{t) ^ Vn{t). This means that we may replace -A^/ [xn{t)]_ by \vVn{t) 
in Equation 0. Taken together, under the assumptions [1,2,2’] we may use the replacements 

[Xn(t)]+ ~ Orn{t) 

[Xn{t)]_ ~ Vn{t) (9) 

in Equation 0, which directly yield Equation 0. Note that this also implies rn{t) > ®/2 if the neuron 
is spiking, so during active periods inter-spike-intervals need to be considerably smaller than the 
synaptic time scale. 

Equation 0 implies that the assumptions are justified for suitable parameters: For fixed parameters 
Ts = and 9 of the r-dynamics, we can choose sufficiently small A^;, large Anm and small 7 to 
ensure assumptions [1,2,2’] (cf. the conditions highlighted in the section “Network architecture”). On 
the other hand, for given dynamics Equation 0, we can always find a spiking system which generates 
the dynamics via Equations 0, 0 and 0, and satisfies the assumptions: We only need to choose 
Ts sufficiently small such that [ 1 ] is satisfied and the spike threshold sufficiently small such that [ 2 , 2 ’] 
are satisfied. For the latter, 7 needs to be scaled like 6 to maintain the dynamics of Xn and Vr needs 
to be computed from the expression for A^. Interestingly, we find that also outside the range where 
the assumptions are satisfied, our approaches can still generate good results. 

The recovery current in our model has the same time constant as the slow synaptic current. Indeed, 
experiments indicate that they possess the same characteristic timescales: Timescales for NMDA 
[29] and slow GABAa ESI lUl receptor mediated currents are several tens of milliseconds. Afterde¬ 
polarizations have timescales of several tens of milliseconds as well [Te] [Ml [^- Another 

prominent class of slow inhibitory currents is mediated by GABAb receptors and has time scales of 
one hundred to a few hundreds of milliseconds [36]. We remark that in our model the time constants 
of the afterdepolarization and the synaptic input currents may also be different without changing the 
dynamics: Assume that the synaptic time constant is different from that of the recovery current, but 
still satisfies the conditions that it is large against the inter-spike-intervals when the neuron is spiking 
and small against the timescale of [xn(t)]+. The synaptic current generated by the spike train of 
neuron n will then be approximately continuous and the filtering does not seriously affect its overall 
shape beyond smoothing out the spikes. As a consequence, the synaptic and the recovery currents 
are approximately proportional up to a constant factor that results from the different integrated contri¬ 
bution of individual spikes to them. Rescaling 7 by this factor thus yields dynamics equivalent to the 
one with identical time constants. 

Distributed encoding of continuous dynamics 

In the above-described simple CSNs (CSNs with saturating synapses), each spiking neuron gives 
rise to one nonlinear continuous variable. The resulting condition that the inter-spike-intervals are 
small against the synaptic time constants if the neuron is spiking may in biological neural networks 
be satisfied for bursting or fast spiking neurons with slow synaptic currents. It will be invalid for differ¬ 
ent neurons and synaptic currents. The condition becomes unnecessary when the spiking neurons 
encode continuous variables collectively, i.e. if we partially replace the temporal averaging in r„(t) by 
an ensemble averaging. This can be realized by an extension of the above model, where only a lower. 
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say J-, dimensional combination x(i) of the AT-dimensional vectors V(i) and r(t) is continuous, 


x(t) = LV(i) + Tr{t) 


( 10 ) 


where L and f are J x matrices (note that Equation ([^ is a special case with N = J and diagonal 
matrices L and f). We find that spiking networks with nonlinear dendrites Equation @ can encode 
such a lower dimensional variable x(t). The x(t) satisfy J-dimensional standard equations describing 
non-spiking continuous rate networks used for reservoir computing j4l|5lll], 


x(t) = — Aa;x(t) -|- A tanh (x(t)) + c{t). 


( 11 ) 


We denote the resulting spiking networks as CSNs with nonlinear dendrites. 

The derivation (see Supplementary material for details) generalizes the ideas introduced in refs. (H 
El 13] to the approximation of nonlinear dynamical systems: We assume an approximate decoding 
equation (cf. also Equation ^), 


x(t) Ri rr(t) 


( 12 ) 


where r is a J x iV decoding matrix and employ an optimization scheme that minimizes the decoding 
error resulting from Equation at each time point. This yields the condition that a spike should 
be generated when a linear combination of x(t) and r{t) exceeds some constant value. We interpret 
this linear combination as membrane potential V(i). Solving for x(i) gives L and f in terms of r 
in Equation ( [ST| . Taking the temporal derivative yields V(t), first in terms of x(t) and r(t) and after 
replacing them via Equations in terms of x(i), r{t) and s(i). We then eliminate x(t) using ([T^ 

and add a membrane potential leak term for biological realism and increased stability of numerical 
simulations. This yields Equation @ together with the optimal values of the parameters given in 
Table 1. We note that the difference to the derivation in ref. [3] is the use of a nonlinear equation 
when replacing ±{t). We further note that the spiking approximation of the continuous dynamics 
becomes exact, if in the last step x(i) is eliminated using Equation ) [ST| ) and the leak term is omitted 
as it does not arise from the formalism in contrast to the case of CSNs with saturating synapses. 
Like in CSNs with saturating synapses, using the approximated decoding Equation eliminates 
the biologically implausible V-dependencies in the interaction terms. For an illustration of this coding 
see Fig. Id. 

Learning universal computations 

Recurrent continuous rate networks are a powerful means for learning of various kinds of compu¬ 
tations, like steering of movements and processing of sequences 0] Ej. For this, an input and/or 
an output feedback signal needs to be able to “enslave” the network’s high-dimensional dynamics 
|Zl E8]. This means that at any point in time the network’s state is a deterministic function of the 
recent history of input and feedback signals. The function needs to be high dimensional, nonlinear, 
and possess fading memory. A standard model generating suitable dynamics are continuous rate 
networks of the form Equation ([jl). Due to the typically assumed random recurrent connectivity, 
each neuron acts as a randomly chosen, nonlinear function with fading memory. Linearly combin¬ 
ing them like basis functions by a linear readout can approximate arbitrary, nonlinear functions with 
fading memory (time-scales are limited by the memory of the network), and in this sense universal 
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computations on the input and the feedback. The feedback can prolong the fading memory and allow 
to generate self-contained dynamical systems and output sequences 013115] [3^. The feedback can 
be incorporated into the network by directly training the recurrent synaptic weights |5.[38l. 

Our understanding of the complex spiking dynamics of CSNs in terms of nonlinear first order differ¬ 
ential equations enables us to apply the above theory to spiking neural networks: In the first step, 
we were able to conclude that our CSNs can generate enslaveable and thus computationally useful 
dynamics as they can be decoded to continuous dynamics that possess this property. In the second 
step, we have to ask which and how output signals should be learned to match a desired signal: 
In a biological setting, the appropriate signals are the sums of synaptic or dendritic input currents 
that spike trains generate, since these affect the somata of postsynaptic neurons as well as effectors 
such as muscles [39]. To perform, e.g., a desired continuous movement, they have to prescribe the 
appropriate muscle contraction strengths. For both CSNs with saturating synapses and with nonlin¬ 
ear dendrites, we choose the outputs to have the same form as the recurrent inputs that a soma of 
a neuron within the CSN receives. Accordingly, in our CSNs with saturating synapses, we interpret 
sums of the postsynaptic currents 


N N 

Zkit) = wl^tanh {-frm{t)) =: Y (13) 

m=l m=l 

as output signals, where the index k distinguishes iTout different outputs, and are the learnable 
synaptic output weights. For networks with nonlinear dendrites the outputs are a linear combination 
of inputs preprocessed by nonlinear dendrites 

Zk{t) = Y '^kj tanh ( Y ^jmrmit) J =: Y '^kj^jit), (14) 

j=l \m=l / j=l 

where the strengths of the dendro-somatic coupling are learned [40J. The networks can now 
learn the output weights such that Zk{t) imitates a target signal Fk{t), using standard learning rules 
for linear readouts (see Fig. 2a for an illustration). We employ the recursive least squares method 

14TJ- 

To increase the computational and learning abilities, the output signals should be fed back to the 
network as an (additional) input (Fig. 2b) 

-^OUt -f^OUt 

= Y.AkY.^kprpit), (15) 

k=l k=l p 

where each neuron receives a linear combination of the output signals zk{t) with static feedback 
connection strengths Here and in the following Greek letter indices such as /3,p range over 
all saturating synapses {/3,p = fp{t) = tanh( 7 r/ 9 (t))) in CSNs with saturating synapses, or 

over all nonlinear dendrites (/3,p = 1,..., J; rp{t) = tanh in CSNs with nonlinear 

dendrites. 

It often seems biologically more plausible not to assume a strong feedback loop that enslaves the 
recurrent network, but rather to train recurrent weights. Our CSNs allow for this (Fig. 2c): We can 
transform the learning of output weights in networks with feedback into mathematically equivalent 
learning of recurrent connection strengths, between synapses (CSNs with saturating synapses) or 
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Figure 2: Setups used to learn versatile nonlinear computations with spiking neural networks, (a) A static con¬ 
tinuous signal coding spiking neural network fCSN, gray shaded) serves as a spiking computationai reservoir with high 
signai-to-noise ratio. The resuits of computations on current and past externai inputs le can be extracted by simpie neuron- 
iike readouts. These iineariy combine somatic inputs generated by saturating synapses or noniinear dendrites, f (red), to 
output signais z (Equations (73|[7^| The output weights w° are iearned suoh that z approximates the desired continuous 


target signais. (b) Plastic continuous signal coding spiking neural networks (PCSNs) possess a ioop that feeds the outputs 
z back via static connections as an additionai input (biue, Equation\l^. Suoh networks have increased computationai 
capabiiities aiiowing them to, e.g., generate desired seif-sustained activity, (c) The feedback ioop can be incorporated into 
the recurrent network via piastic reourrent connections (red in gray shaded area). 


dendrites (CSNs with nonlinear dendrites) and the soma [40] (we learn Anm, see Methods for details 
of the implementation). We note that approximating different dynamical systems, e.g. ones equivalent 
to Equation ([jl) but with the coupling matrix inside the nonlinearity [42], may also in CSNs with 
nonlinear dendrites allow to learn synaptic weights in similar manner. We call CSNs with learning of 
outputs in presence of feedback, or with learning of recurrent connections plastic continuous signal 
coding spiking neural networks (PCSNs). 

To learn feedback and recurrent connections, we use the FORCE imitation learning rule, which has 
recently been suggested for networks of continuous rate neurons |5i[38]: We use fast online learning 
based on the recursive least squares rule of the output weights in order to ensure that the output of 
the network is similar to the desired output at all times. Since during training the output is ensured to 
be close to the desired one, it can be used as feedback to the network at all times. The remaining 
deviations from the desired output are expected to be particularly suited as training noise as they 
reflect the system’s inherent noise. As mentioned before, the feedback loop may be incorporated in 
the recurrent network connectivity. During training, the reservoir connections are then learned in a 
similar manner as the readout. 

In the following, we show that our approach allows spiking neural networks to perform a broad variety 
of tasks. In particular, we show learning of desired self-sustained dynamics at a degree of difficulty 
that has, to our knowledge, previously only been accessible with continuous rate networks. 


Applications 

Self-sustained pattern generation 

Animals including humans can learn a great variety of movements, from periodic patterns like gait 
or swimming, to much more complex ones like producing speech, generating chaotic locomotion 
g3l[44] or playing the piano. Moreover when an animal learns to use an object (Fig. 3a), it has to 
learn the dynamical properties of the object as well as how its body behaves when interacting with 
it. Especially for complex, non-periodic dynamics, a dynamical system has to be learned with high 
precision. 

How are spiking neural networks able to learn dynamical systems, store them and replay their activ¬ 
ity? We find that PCSNs may solve the problem. They are able to learn periodic patterns of different 
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degree of complexity as well as chaotic dynamical systems by imitation learning. Fig. 3 illustrates this 
for PCSNs with nonlinear synapses (Fig. 3d,e) and with nonlinear dendrites (Fig. 3b,e,f). 

The figure displays the recall of three periodic movements after learning: a sine wave, a more com¬ 
plicated non-differentiable saw tooth pattern and a “camel’s hump” superposition of sine and cosine. 
Also for long simulation times, we find no deviation from the displayed dynamics except for an in¬ 
evitable phase shift (Fig. Ga in SI Supporting Information). It results from accumulation of small 
differences between the learned and desired periods. Apart from this, the error between the recalled 
and the desired signals is approximately constant over time (Fig. Gb in SI Supporting Information). 
This indicates that the network has learned a stable periodic orbit to generate the desired dynamics, 
the orbit is sufficiently stable to withstand the intrinsic noise of the system. Fig. 3 furthermore illus¬ 
trates learning of a chaotic dynamical system. Here, the network learns to generate the time varying 
dynamics of all three components of the Lorenz system and produces the characteristic attractor pat¬ 
tern after learning (Fig. 3f). Due to the encoding of the dynamics in spike trains, the signal maintains 
a small deterministic error which emerges from the encoding of a continuous signal by discrete spikes 
(Fig. 3g). The individual training and recall trajectories quickly depart from each other after the end of 
learning since they are chaotic. However, also for long simulation times, we observe qualitatively the 
same dynamics, indicating that the correct dynamical system was learned (Fig. Gc in SI Supporting 
Information). Occasionally, errors occur, cf. the larger loop in Fig. 3f. This is to be expected due to the 
relatively short training period, during which only a part of the phase space covered by the attractor 
is visited. Importantly, we observe that after errors the dynamics return to the desired ones indicating 
that the general stability property of the attractor is captured by the learned system. To further test 
these observations, we considered a not explicitly trained long-term feature of the Lorenz-dynamics, 
namely the tent-map which relates the height Zn-i of the (n- l)th local maximum in the ^;-coordinate, 
to the height Zn of the subsequent local maximum. The spiking network indeed generates the map 
(Fig. 3h), with two outlier points corresponding to each error. 

In networks with saturating synapses, the spike trains are characterized by possibly intermittent peri¬ 
ods of rather high-frequency spiking. In networks with nonlinear dendrites, the spike trains can have 
low frequencies and they are highly irregular (Figs. 3c, F in SI Supporting Information). In agree¬ 
ment with experimental observations (e.g. [45]), the neurons can have preferred parts of the encoded 
signal in which they spike with increased rates. 

The dynamics of the PCSNs and the generation of the desired signal are robust against dynamic and 
structural perturbations. They sustain noise inputs which would accumulate to several ten percent of 
the level of the threshold within the membrane time constant, for a neuron without further input (Fig. B 
in SI Supporting Information). For larger deviations of Wnjm from their optimal values, PCSNs with 
nonlinear dendrites can keep their learning capabilities, if /i is tuned to a specific range. Outside 
this range, the capabilities break down at small deviations (Fig. C in SI Supporting Information). 
However, a slightly modified version of the models, where the reset is always to -9 (even if there 
was fast excitation that drove the neuron to spike by a suprathreshold input), has a high degree of 
robustness against such structural perturbations. We also checked that the fast connections are 
important, albeit substantial weakening can be tolerated (Fig. D in SI Supporting Information). 

The deterministic spike code of our PCSNs encodes the output signal much more precisely than 
neurons generating a simple Poisson code, which facilitates learning. We have quantified this using 
a comparison between PCSNs with saturating synapses and networks of Poisson neurons of equal 
size, both learning the saw tooth pattern in the same manner. Since both codes become more precise 
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Figure 3: Learning dynamics with spiking neurai networks, (a): Schematic hunting scene, illustrating the need for 
complicated dynamicai systems learning and control. The homlnid has to predict the motion of its prey, and to predict and 
controi the movements of its body and the projectiie. (b-h): Learning of self-sustained dynamical patterns by spiking neural 
networks, (b): A sine wave generated by summed, synapticaily and dendriticaliy filtered output spike trains of a PCSN 
with nonlinear dendrites, (c): A sample of the network’s spike trains generating the sine in (b). (d): A saw tooth pattern 
generated by a PCSN with saturating synapses, (e): A more complicated smooth pattern generated by both architectures 
(blue: nonlinear dendrites, red: saturating synapses), (f-h): Learning of chaotic dynamics (Lorenz system), with a PCSN 
with nonlinear dendrites, (f): The spiking network imitates an example trajectory of the Lorenz system during training (blue); 
it continues generating the dynamics during testing (red), (g): Detailed view of (f) highlighting how the exampte trajectory 
(yellow) Is Imitated during training and continued during testing, (h): The spiking network approximates not explicitly trained 
quantitative dynamical features, like the tent map between subsequent maxima of the z-coordinate. The ideai tent map 
(yellow) Is closely approximated by the tent map generated by the PCSN (red). The spiking network sporadically generates 
errors, of the larger loop In (f) and the outlier points in (h). Panel (h) shows a ten times longer time series than (f), with 
three errors. 


with increasing spike rate of individual neurons, we compared the testing error between networks 
with equal spike rates. Due to their higher signal-to-noise ratio, firing rates required by the PCSNs to 
achieve the same pattern generation quality are more than one order of magnitude lower (Fig. A in 
SI Supporting Information). 


Delayed reaction/time interval estimation 

For many tasks, e.g. computations focusing on recent external input and generation of self-sustained 
patterns, it is essential that the memory of the involved recurrent networks is fading: If past states 
cannot be forgotten, they lead to different states in response to similar recent inputs. A readout that 
learns to extract computations on recent input will then quickly reach its capacity limit. In neural 
networks, fading memory originates on the one hand from the dynamics of single neurons, e.g. due 
to their finite synaptic and membrane time constants; on the other hand it is a consequence of the 
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neurons’ connection to a network @6] [471 [48]. In standard spiking neural network models, the overall 
fading memory is short, of the order of hundreds of milliseconds (49l|9l[10l|T3]. It is a matter of current 
debate how this can be extended by suitable single neuron properties and topology (H |3l [50l [STI . 
Many biological computations, e.g. the simple understanding of a sentence, require longer memory, 
on the order of seconds. 

We find that CSNs without learning of recurrent connectivity or feedback access such time scales. 
We illustrate this by means of a delayed reaction/time estimation task: In the beginning of a trial, the 
network receives a short input pulse. By imitation learning, the network output learns to generate 
a desired delayed reaction. For this, it needs to specifically amplify the input’s dynamical trace in 
the recurrent spiking activity, at a certain time interval. The desired response is a Gaussian curve, 
representative for any type of delayed reaction. The reaction can be generated several seconds after 
the input (Fig. 4a-c). 

The quality of the reaction pattern depends on the connection strengths within the network, specified 
by the spectral radius g of the coupling matrix divided by the leak of a single corresponding contin¬ 
uous unit Aa;. Memory is kept best in an intermediate regime (Fig. 4b), where the CSN stays active 
over long periods of time without overwriting information. This has also been observed for contin¬ 
uous rate networks [52]. For too weak connections (Fig. 4a), the CSN returns to the inactive state 
after short time, rendering it impossible to retrieve input information later. If the connections are too 
strong, (Fig. 4c), the CSN generates self-sustained, either irregular asynchronous or oscillating ac¬ 
tivity, partly overwriting information and hindering its retrieval. We observe that already the memory 
in disconnected CSNs with synaptic saturation can last for times beyond hundreds of milliseconds 
(cf. Fig. E in SI Supporting Information). This is a consequence of the recovery current: If a neuron 
has spiked several times in succession, the accumulated recovery current leads to further spiking 
(and further recovery current), and thus dampens the decay of a strong activation of the neuron [53]. 

Experiments show that during time estimation tasks, neurons are particularly active at two times: 
When the stimulus is received and when the estimated time has passed (541 Often the neu¬ 
ron populations that show activity at these points are disjoint. Our model reproduces this behavior 
for networks with good memory performance. In particular, at the time of the initial input the recur¬ 
rently connected neurons become highly active (gray traces in Fig. 4b, upper sub-panel) while at the 
estimated reaction time, readout neurons would show increased activity (red trace). 

Persistent memory and context dependent switching 

Tasks often also require to store memories persistently, e.g. to remember instructions [56]. Such 
memories may be maintained in learned attractor states (e.g. [57] ^ [60]). In the framework 

of our computing scheme, this requires the presence of output feedback [3]. Here, we illustrate the 
ability of PCSNs to learn and maintain persistent memories as attractor states as well as the ability to 
change behavior according to them. For this, we use a task that requires memorizing computational 
instructions (Fig. 4d) [3J. The network has two types of inputs: After pulses in the instruction channels, 
it needs to switch persistently between different rules for computation on the current values of operand 
channels. To store persistent memory, the recurrent connections are trained such that an appropriate 
output can indicate the instruction channel that has sent the last pulse: The network learns to largely 
ignore the signal when a pulse arrives from the already remembered instruction channel, and to 
switch states otherwise. Due to the high signal-to-noise ratio of our deterministic spike code, the 
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Figure 4: Learning of ionger-term memory dependent computations with spiking neurai networks, (a-c): Delayed 
reaction and time interval estimation: The synaptic output of a CSN learns to generate a generic reaction several seconds 
after a short input. Upper panels show typical examples of input, desired and actual reactions (green, yellow and red 
traces). In the three panels, the desired reaction delay is the same (9sec), the networks (CSNs with saturating synapses) 
have different levels of recurrent connection strengths ((a), (b), (c): low, intermediate, high level). The generation of the 
reaction is best for the network with intermediate level of connection strength. The CSNs with lower or higher levels have 
not maintained sufficient memory due to their extinguished or noisy and likely chaotic dynamics (gray background lines: 
spike rates of individual neurons). The median errors of responses measured for different delays in ensembles of networks 
(levels of connection strength as in the upper panels), are given in the lower panels. The shaded regions represent the 
area between the first and third quartile of the response errors. Dashed lines highlight delay and error size of the examples 
in the upper panels, (d): Persistent retaining of instructions and switching between computations: The network receives (i) 
two random continuous operand inputs (upper sub-panel, yellow and purple traces), and (ii) two pulsed instruction inputs 
(middle sub-panel, blue and green; memory of last instruction pulse: red). The network has learned to perform different 
computations on the operand inputs, depending on the last instruction (lower subpanel): if it was +1 (triggered by instruction 
channel 1), the network performs a nonlinear computation, it outputs the absolute value of the difference of the operands 
(red trace (network output) agrees with blue); if it was -1 (triggered by channel 2), the values of the operands are added 
(red trace agrees with green trace). 
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PCSNs are able to keep a very accurate representation of the currently valid instruction in their 
recurrent dynamics. Fig. 4d, middle sub-panel, shows this by displaying the output of the linear 
readout trained to extract this instruction from the network dynamics. A similarly high precision can 
be observed for the output of the computational task, cf. Fig. 4d, lower sub-panel. 

Building of world models, and control 

In order to control its environment, an animal has to learn the laws that govern the environment’s dy¬ 
namics, and to develop a control strategy. Since environments are partly unpredictable and strategies 
are subject to evolutionary pressure, we expect that they may be described by stochastic optimal con¬ 
trol theory. A particularly promising candidate framework is path integral control, since it computes 
the optimal control by simulating possible future scenarios under different random exploratory con¬ 
trols, and the optimal control is a simple weighted average of them [0 ]. For this, an animal needs 
an internal model of the system or tool it wants to act on. It can then mentally simulate different 
ways to deal with the system and compute an optimal one. Recent experiments indicate that animals 
indeed conduct thought experiments exploring and evaluating possible future actions and movement 
trajectories before performing one f62l[63l . 

Here we show that by imitation learning, spiking neural networks, more precisely PCSNs with a 
feedback loop, can acquire an internal model of a dynamical system and that this can be used to 
compute optimal controls and actions. As a specific, representative task, we choose to learn and 
control a stochastic pendulum (Fig. 5a,b). The pendulum’s dynamics are given by 

0(t) -I- cxJo^{t) -I- coq sm{(f>{t)) = ^{t) + u{t), (16) 

with the angular displacement </> relative to the direction of gravitational acceleration, the undamped 
angular frequency for small amplitudes wo, the damping ratio c, a random (white noise) angular force 
^{t) and the deterministic control angular force u{t), both applied to the pivot axis. The PCSN needs 
to learn the pendulum’s dynamics under largely arbitrary external control forces; this goes beyond the 
tasks of the previous sections. It is achieved during an initial learning phase characterized by motor 
babbling as observed in infants 164] and similarly in bird song learning [65]: During this phase, there 
is no deterministic control, u = 0, and the pendulum is driven by a random exploratory force ^ only. 
Also the PCSN receives ^ as input and learns to imitate the resulting pendulum’s dynamics with its 
output. 

During the subsequent control phase starting al t = 0, the aim is to swing the pendulum up and 
hold it in the inverted position (Fig. 5c). For this, the PCSN simulates at time t a set of M future 
trajectories of the pendulum, for different random exploratory forces (“mental exploration” with u = 
0, cf. Fig. 5a,b), starting with the current state of the pendulum. In a biological system, the initialization 
may be achieved through sensory input taking advantage of the fact that an appropriately initialized 
output enslaves the network through the feedback. Experiments indicate that explored trajectories 
are evaluated, by brain regions separate from the ones storing the world model USESZllIS]. We thus 
assign to the simulated trajectories a reward Ri measuring the agreement of the predicted states 
with the desired ones. The optimal control u{t -h s) (cf. Equation ([T^) for a subsequent, not too large 
time interval s g [0,5] is then approximately given by a temporal average over the initial phase of the 
assumed random forces, weighted by the exponentiated total expected reward. 
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where ^i{t) = | ii{i)di and Ac is a weighting factor. We have chosen Ri{t) = yi{i)di, i.e. the 
expected reward increases linearly with the heights yi{i) = - cos{(j)i{i)) predicted for the pendulum 
for input trajectory it becomes maximal for a trajectory at the inversion point. Tr is the duration of 
a simulated trajectory. The optimal control is applied to the pendulum until t + A, with A < 6 . Then, 
al t + A, the PCSN simulates a new set of trajectories starting with the pendulum’s updated state 
and a new optimal control is computed. This is valid and applied to the pendulum between t + A 
and t + 2A, and so on. We find that controlling the pendulum by this principle leads to the desired 
upswing and stabilization in the inversion point, even though we assume that the perturbing noise 
force ^ (Equation ([T^) acting on the pendulum in addition to the deterministic control u, remains as 
strong as it was during the exploration/learning phase (cf. Fig. 5a,b). 


We find that for controlling the pendulum, the learned internal model of the system has to be very 
accurate. This implies that particular realizations of the PCSN can be unsuited to learn the model (we 
observed this for about half of the realizations), a phenomenon that has also been reported for small 
continuous rate networks before. However, we checked that continuous rate networks as encoded 
by our spiking ones reliably learn the task. Since the encoding quality increases with the number of 
spiking neurons, we expect that sufficiently large PCSNs reliably learn the task as well. 


Discussion 

The characteristic means of communication between neurons in the nervous system are spikes. It is 
widely accepted that sequences of spikes form the basis of neural computations in higher animals. 
How computations are performed and learned is, however, largely unclear. Here we have derived 
continuous signal coding spiking neural networks (CSNs), a class of mesoscopic spiking neural net¬ 
works that are a suitable substrate for computation. Together with plasticity rules for their output or 
recurrent connections, they are able to learn general, complicated computations by imitation learn¬ 
ing (plastic CSNs, PCSNs). Learning can be highly reliable and accurate already for comparably 
small networks of hundreds of neurons. The underlying principle is that the networks reflect the in¬ 
put in a complicated nonlinear way, generate nonlinear transformations of it and use fading memory 
such that the inputs and their pasts interfere with each other. This requires an overall nonlinear re¬ 
laxation dynamics suitable for computations [4]. Such dynamics are different from standard spiking 
neural network dynamics, which are characterized by a high level of noise and short intrinsic memory 
191Q01[69][1I]. 

To find spiking networks that generate appropriate dynamics, we use a linear decoding scheme for 
continuous signals encoded in the network dynamics as combinations of membrane potentials and 
synaptic currents. A specific coding scheme like this was introduced in refs. (Hj^ to derive spiking 
networks encoding linear dynamics in an optimal way. We introduce spiking networks where the 
encoded signals have dynamics desirable for computation, i.e. a nonlinear, high-dimensional, low- 
noise, relaxational character as well as significant fading memory. We conclude that, since we use 
simple linear decoding, already the dynamics of the spiking networks must possess these properties. 

Using this approach, we study two types of CSNs: Networks with saturating synapses and networks 
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Figure 5: Model building and mental exploration to compute optimal control, (a): Learning of an internal world 
model with spiking neural networks. During model building, random exploratory control drives the dynamical system (here: 
a swinging pendulum). The spiking neural network is provided with the same control as input and learns to mimic the 
behavior of the pendulum as its output, (b): After learning, the spiking network can simulate the system’s response to 
control signals. The panel displays the height of the real pendulum in the past (solid black line) and future heights under 
different exploratory controls (dashed lines). For the same controls, the spiking neural network predicts very similar future 
positions (colored lines) as the imitated system. It can therefore be used for mental exploration and computation of optimal 
control to reach an aim, here: to invert the pendulum, (c): During mental exploration, the network simulates in regular time 
intervals a set of possible future trajectories for different controls, starting from the actual state of the pendulum. From this, 
the optimal control until the next exploration can be computed and applied to the pendulum. The control reaches its aim: 
The pendulum is swung up and held in inverted position, despite a high level of noise added during testing (uncontrolled 
dynamics as in panel (a)). 
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with nonlinear dendrites. The CSNs with saturating synapses use a direct signal encoding; each 
neuron codes for one continuous variable. It requires spiking dynamics characterized by possibly 
intermittent phases of high rate spiking, or bursting, with inter-spike-intervals smaller than the synaptic 
time constants, which leads to a temporal averaging over spikes. Dynamics that appear externally 
similar to such dynamics were recently highlighted as a ‘second type of balanced state’ in networks 
of pulse-coupled, intrinsically oscillating model neurons [51]. Very recently ^ [Til showed that 
networks whose spiking dynamics are temporally averaged due to slow synapses possess a phase 
transition from a fixed point to chaotic dynamics in the firing rates, like the corresponding rate models 
that they directly encode. In the analytical computations the spike coding was not specified [70] 
or assumed to be Poissonian [7T]. Numerical simulations of leaky integrate-and-fire neurons in the 
chaotic rate regime can generate intermittent phases of rather regular high-rate spiking [70]. The 
networks might provide a suitable substrate for learning computations as well. However, since the 
chaotic rate dynamics have correlations on the time scale of the slow synapses its applicability is 
limited to learning tasks where only a short fading memory of the reservoir is needed. For example 
delayed reaction tasks as illustrated in Fig. 4a-c would not be possible. Interestingly, in our scheme 
a standard leaky integrate-and-fire neuron with saturating synapses appears as a special case with 
recovery current of amplitude zero. According to our analysis it can act as a leaky integrator with 
a leak of the same time constant as the synapses, = A<j. In contrast, in presence of a recovery 
current, our networks with saturating synapses can encode slower dynamics on the order of seconds. 
After training the network, the time scales can be further extended. 

In the CSNs with nonlinear dendrites the entire neural population codes for a usually smaller num¬ 
ber of continuous variables, avoiding high firing rates in sufficiently large networks. The networks 
generate irregular, low frequency spiking and simultaneously a noise-reduced encoding of nonlin¬ 
ear dynamics, the temporal averaging over spikes in the direct coding case is partially replaced by 
a spatial averaging over spike trains from many neurons. The population coding scheme and our 
derivations of CSNs with nonlinear dendrites generalize the predictive coding proposed in ref. [3] to 
nonlinear dynamics. The roles of our slow and fast connections are similar to those used there: In 
particular, redundancies in the spiking are eliminated by fast recurrent connections without synaptic 
filtering. We expect that these couplings can be replaced by fast connections that have small finite 
synaptic time constants, as shown for the networks of ref. [3] in ref. [72]. In contrast to previous 
work, in the CSNs with nonlinear dendrites we have linear and nonlinear slow couplings. The former 
contribute to coding precision and implement linear parts of the encoded dynamics, the latter imple¬ 
ment the nonlinearities in the encoded dynamics. Further, in contrast to previous work, the spike 
coding networks provide only the substrate for learning of general dynamical systems by adapting 
their recurrent connections. Importantly, this implies (i) that the neurons do not have to adapt their 
nonlinearities to each nonlinear dynamical system that is to be learned (which would not seem biolog¬ 
ically plausible) and (ii) that the CSNs do not have to provide a faithful approximation of the nonlinear 
dynamics Equations (1^,([TT), since a rough dynamical character (i.e. slow dynamics and the echo 
state property) is sufficient for serving as substrates. We note that refs. {7^ |74l suggested to use 
the differential equations that characterize dynamical systems to engineer spiking neural networks 
that encode the dynamics. The approach suggests an alternative derivation of spiking networks that 
may be suitable as substrate for learning computations. Their rate coding scheme, however, allows 
for redundancy and thus higher noise levels, and it generates high frequency spiking. In a future 
publication, B. DePasquale, M. Churchland, and L.F. Abbott will present an approach to train rate 
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coding spiking neural networks, with continuous rate networks providing the target signals [75]. We 
will discuss the relation between our and this approach in a joint review [76]. 

A characteristic feature of our neuron models is that they take into account nonlinearities in the 
synapses or in the dendrites. On the one hand this is biologically plausible (TSl[THIlO]|211, on the 
other hand it is important for generating nonlinear computations. Our nonlinearities are such that 
the decoded continuous dynamics match those for typical networks of continuous rate neurons and 
provide a simple model for dendritic and synaptic saturation. However, the precise form of the neuron 
model and its nonlinearity is not important for our approaches: As long as the encoded dynam¬ 
ical system is suitable as a computational reservoir, the spiking system is a CSN and our learning 
schemes will work. As an example, a dendritic tree with multiple interacting compartments may be di¬ 
rectly implemented in both the networks with saturating synapses and in the networks with nonlinear 
dendrites. A future task is to explore the computational capabilities of CSNs incorporating different 
and biologically more detailed features that lead to nonlinearities, e.g. neural refractory periods, den¬ 
dritic trees with calcium and NMDA voltage dependent channels and/or standard types of short term 
synaptic plasticity. 

Inspired by animals’ needs to generate and predict continuous dynamics such as their own body and 
external world movements, we let our networks learn to approximate desired continuous dynamics. 
Since effector organs such as muscles and post-synaptic neurons react to weighted, possibly dendrit- 
ically processed sums of post-synaptic currents, we interpret these sums as the relevant, continuous 
signal-approximating outputs of the network [39]. Importantly, this is not the same as Poissonian rate 
coding of a continuous signal: As a simple example, consider a single spiking neuron. In our scheme 
it will spike with constant inter-spike-intervals to encode a constant output. In Poissonian rate coding, 
the inter-spike-intervals will be random, exponentially distributed and many more spikes need to be 
sampled to decode the constant output (cf. Fig. A in S1 Supporting Information). 

The outputs and recurrent connections of CSNs can be learned by standard learning rules |4T[[5|. 
The weight changes depend on the product of the error and the synaptic or dendritic currents and may 
be interpreted as delta-rules with synapse- and time-dependent learning rates. PCSNs, with learning 
of recurrent weights or output feedback, show how spiking neural networks may learn internal models 
of complicated, self-sustained environmental dynamics. In our applications, we demonstrate that they 
can learn to generate and predict the dynamics in different depths, ranging from the learning of single 
stable patterns over the learning of chaotic dynamics to the learning of dynamics incorporating their 
reactions to external influences. 

The spiking networks we use have medium size, like networks with continuous neurons used in 
the literature fH |5]. CSNs with saturating synapses have, by construction, the same size as their 
non-spiking counterparts. In CSNs with nonlinear dendrites the spike load necessary to encode the 
continuous signals is distributed over the entire network. This leads to a trade-off between lower 
spiking frequency per neuron and larger network size (cf. Fig. F in SI Supporting Information): The 
faster the neurons can spike the smaller the network may be to solve a given task. 

Previous work using spiking neurons as a reservoir to generate a high dimensional, nonlinear projec¬ 
tion of a signal for computation, concentrated on networks without output feedback or equivalent task- 
specific learning of recurrent connectivity (UlZZli^. Such networks are commonly called “liquid state 
machines” [78]. By construction, they are unable to solve tasks like the generation of self-sustained 
activity and persistent memorizing of instructions; these require an effective output feedback, since 
the current output determines the desired future one: To compute the latter, the former must be made 
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available to the network as an input. The implementation of spiking reservoir computers with feed¬ 
back was hindered by the high level of noise in the relevant signals: The computations depend on the 
spike rate, the spike trains provide a too noisy approximation of this average signal and the noise is 
amplified in the feedback loop. While analytically considering feedback in networks of continuous rate 
neurons, ref. [3] showed examples of input-output tasks solved by spiking networks with a feedback 
circuit, the output signals are affected by a high level of noise. This concerns even output signals just 
keeping a constant value. We implemented similar tasks (Fig. 4d), and find that our networks solve 
them very accurately due to their more efficient coding and the resulting comparably high signal-to- 
noise ratio. In contrast to previous work, our derivations systematically delineate spiking networks 
which are suitable for the computational principle with feedback or recurrent learning; the networks 
can accurately learn universal, complicated memory dependent computations as well as dynamical 
systems approximation, in particular the generation of self-sustained dynamics. 

In the control task, we show how a spiking neural network can learn an internal model of a dynamical 
system, which subsequently allows to control the system. We use a path integral approach, which 
has already previously been suggested as a theory for motor control in biological systems 179] [80l . 
We apply it to learned world models, and to neural networks. Path integral control assumes that noise 
and control act in a similar way on the system [^]. This assumption is comparably weak and the 
path integral control method has been successfully applied in many robotics applications lFf][8^f83l , 
where it was found to be superior to reinforcement learning and adaptive control methods. 
Continuous rate networks using recurrence, readouts, and feedback or equivalent recurrent learning, 
are versatile, powerful devices for nonlinear computations. This has inspired their use in manifold 
applications in science and engineering, such as control, forecasting and pattern recognition [6]. 
Our study has demonstrated that it is possible to obtain similar performance using spiking neural 
networks. Therewith, our study makes spiking neural networks available for similarly diverse, complex 
computations and supports the feasibility of the considered computational principle as a principle for 
information processing in the brain. 


Methods 

Network simulation 

We use a time grid based simulation scheme (step size dt). If not mentioned otherwise, between time 
points, we compute the membrane potentials using a Runge-Kutta integration scheme for dynamics 
without noise and an Euler-Maruyama integration scheme for dynamics with noise. Since CSNs 
with nonlinear dendrites have fast connections without conduction delays and synaptic filtering, we 
process spikings at a time point as follows: We test whether the neuron with the highest membrane 
potential is above threshold. If the outcome is positive, the neuron is reset and the impact of the 
spike on postsynaptic neurons is evaluated. Thereafter, we compute the neuron with the highest, 
possibly updated, membrane potential and repeat the procedure. If all neurons have subthreshold 
membrane potential, we proceed to the next time point. The described consecutive updating of 
neurons in a single time step increases in networks with nonlinear dendrites the robustness of the 
simulations against larger time steps, as the neurons maintain an order of spiking and responding like 
in a simulation with smaller time steps and a small but finite conduction delay and/or slight filtering 
of fast inputs. As an example, the scheme avoids that neurons that code for similar features and 
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thus possess fast mutual inhibition, spike together within one step and generate an overshoot in the 
readout, as it would be the case in a parallel membrane potential updating scheme. The different 
tasks use either networks with saturating synapses or networks with nonlinear dendrites. In both 

cases, A is a sparse matrix with a fraction p of non-zero values. These are drawn independently 

2 

from a Gaussian distribution with zero mean and variance ^ (CSNs with saturating synapses) or 
^ (CSNs with nonlinear dendrites), which sets the spectral radius of A approximately to g. For 
networks with nonlinear dendrites, the elements of r are drawn from a standard normal distribution. 
To keep the approach simple, we allow for positive and negative dendro-somatic couplings. In order 
to achieve a uniform distribution of spiking over the neurons in the network, we normalize the columns 
of r to have the same norm, which we control with the parameter 7 ^. This implies that the thresholds 
are identical. 

Training phase 

The networks are trained for a period of length Tt such that the readouts Zk imitate target signals 
Fk{t), i.e. such that the time average of the square of the errors ek{t) = zk{t) - Fk{t) is minimized. 
At Tt, training stops and the weights are not updated anymore in the subsequent testing. If present, 
the external input to the neurons is a weighted sum of K-^ continuous input signals fk{t), Ie,i 3 {t) = 
Ylk=i'^i 3 kfk{t), where the index (5 runs from 1 to A (CSNs with saturating synapses) or from 1 to J 
(CSNs with nonlinear dendrites). The weights w^k are fixed and drawn from a uniform distribution 
in the range [-w\w'']. If present, the feedback weights (cf. Equation (Q^) are likewise chosen 
randomly from a uniform distribution in the range [-w-l',w^ with a global feedback parameter w^f. 

For the delayed reaction/time estimation task (Figs. 4a-c, E in SI Supporting Information), we ap¬ 
plied the RLS (recursive least squares) algorithm [El to learn the linear outputs. For the pattern 
generation, instruction switching and control tasks, we applied the FORCE (first-order reduced and 
controlled error) algorithm [5] (Figs. 3, 4d, 5, A-D, F and G in SI Supporting Information) to learn the 
recurrent connections and linear outputs. 

Learning rules 

The output weights are trained using the standard recursive least squares method [El- They are 
initialized with 0 , we use weight update intervals of At. The weight update uses the current training 
error ek{t) = zk{t) - Fk{t), where zkit) is the output that should imitate the target signal Fk{t), it 
further uses an estimate T/ 3 p(f) of the inverse correlation matrix of the unweighted neural synaptic or 
dendritic inputs r^(t), as well as these inputs. 


w°kfiit) = wlp{t - At) - ek{t)'^Piip{t)rp{t). (18) 

p 

The indices /3,p range over all saturating synapses {I3,p = 1 ,..., A; f^(t) = tanh( 7 r^(i))) or all non¬ 
linear dendrites {I3,p = rpit) = tanh(^Y)m=i^i 3 mrmit)^) of the output neuron. The square 

matrix P is a running filter estimate of the inverse correlation matrix of the activity of the saturated 
synapses (CSNs with saturating synapses) or non-linear dendrites (CSNs with nonlinear dendrites). 
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The matrix is updated via 


Ep Ha Pfipit - At)rp{t)ra{t)Pa^{t - At) 

1 + T.plla^p{t)Ppo{t - At)?v(t) 


-P/ 37 W = P0i{t - At) - 


(19) 


where the indices /3, 7 ,^, 0 - run from 1 to (CSNs with saturating synapses) or from 1 to J (CSNs 
with nonlinear dendrites). P is initialized as P(0) = a~^l with acting as a learning rate. 

For the update of output weights in presence of feedback and of recurrent weights we adopt the 
FORCE algorithm [5]. In presence of feedback, this means that recursive least squares learning of 
output is fast against the temporal evolution of the network, and already during training the output is 
fed back into the network. Thus, each neuron gets a feedback input 

-^OUt -^out 

= Y^kY ^kprpit). ( 20 ) 

k=l k=l p 

The feedback weights are static, the output weights are learned according to Equation ffs] ). 
Since the outputs are linear combinations of synaptic or dendritic currents, which also the neurons 
within the network linearly combine, the feedback loop can be implemented by modifying the recurrent 
connectivity, by adding a term J2k=i ^^e matrix Ap^. Learning then affects the output 
weights as well as the recurrent connections, separate feedback connections are not present. This 
learning and learning of output weights with a feedback loop are just two different interpretations of 
the same learning rule. For networks with saturating synapses the update is 

i^out ^ 

Anmit) = Anm{t “ Ai) - ^ wl^^ek{t) ^ Pmlit)ri{t), (21) 

k=l 1=1 

where the ^ik are now acting as learning rates. For networks with nonlinear dendrites, the update is 

J Kout J 

Dnj{t) = Dnj{t - At) - ^ Ti^ ^ w{f.ek{t) ^ Pjh{t)rh{t). (22) 

i=\ k=l h=l 


Control task 

The task is achieved in two phases, the learning and the control phase. 

1. Learning: The PCSN learns a world model of the noisy pendulum, i.e. it learns the dynamical 
system and how it reacts to input. The pendulum follows the differential Equation ([T^ with cuo = 
O.ls^^ and wg = 10 s“^, i{t) is a white noise force with {i{t)i{t')) = -1'), x{t) = sin(0(t)) and 

y{t) = -cos{(j){t)) are Cartesian coordinates of the point mass. The neural network has one input 
and three outputs which are fed back into the network; it learns to output the x- and the y-coordinate, 
as well as the angular velocity of the pendulum when it receives as input the strength of the angular 
force (noise plus control) ({t) + u{t) applied to the pivot axis of the pendulum. The learning is here 
interpreted as learning in a network with feedback, cf. Equation ( [^ . 

We created a training trajectory of length T* = 1000 s by simulating the pendulum with the given 
parameters and by driving it with white noise ^(t) as an exploratory control {u{t) = 0). Through its 
input, the PCSN receives the same white noise realization ^(t). During training the PCSN learns to 
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imitate the reaction of the pendulum to this control, more precisely its outputs learn to approximate 
the trajectories of x, y and co. As feedback to the reservoir during training we choose a convex 
combination of the reservoir output and the target (feedback = 0.9 • output + 0.1 • target). We find 
that such a combination improves performance: If the output at the beginning of the training is very 
erroneous, those errors are accumulated through the feedback-loop, which prevents the algorithm 
from working. On the other hand, if one feeds back only the target signal, the algorithm does not 
learn how to correct for feedback transmitted readout errors. In our task, the convex combination 
alleviates both problems. 

2. Control: In the second phase, the learned world model of the pendulum is used to compute 
stochastic optimal control that swings the pendulum up and keeps it in the inverted position. The 
PCSN does not learn its weights in this phase anymore. It receives the different realizations of 
exploratory (white noise) control and predicts the resulting motion (“mental exploration”). From this, 
the optimal control may be computed using the path integral framework 161]. In this framework a 
stochastic dynamical system (which is possibly multivariate) 

x(t) =f(x(t))+ u(x(t),i)+ ^(t) (23) 

with arbitrary nonlinearity f(x(t)) and white noise ${t), is controlled by the feedback controller u(x(t), t) 
to optimize an integral C{t) over a state cost U{x{i)) and a moving horizon quadratic control cost, 
C{t) = U{x{i)) + u{i)'^di. The reward is related to the cost by ii = -C. Path integral control 

theory shows that the control at time t can be computed by generating samples from the dynamical 
system under the uncontrolled dynamics 


i(i) =f(x(i)) + ^(i). 


(24) 


The control is then given by the success weighted average of the noise realizations Ci 


u{t) 



t-\-5 

j 

t 


(25) 


where Ci{t) = U{yii{i))di is the cost observed in the ith realization of the uncontrolled dynam¬ 
ics, which is driven by noise realization and w = 0. Equation (Q^ is a discrete approximation to 
Equation ([^. In our task. Equation ([^ becomes 


Z(i) = uj{t) 

Lo{t) = —Wq sin((^(t)) — cuJo^(t) + C{t) + u{t) 


and f7(x(i)) = —y{t) = cos{(f>{t)). 


Figure details 

The parameters of the different simulations are given in Table for simulations using saturating 
synapses and in Table for simulations using nonlinear dendrites. Further parameters and details 
about the figures and simulations are given in the following paragraphs. 

If not mentioned otherwise, for all simulations we use g = l.S^, p = 0.1, = 1^, = l|, Ai = 
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dt 
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Fig. 3e 

50 
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lOOS 

100ms 

100ms 

0.90 

0.03 

Fig. 4a-c 
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Table 2: Parameters used in the different figures for simuiations of networks with saturating synapses. 
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0 

100ms 

100ms 

As-ls 

Fig. 4d 
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Fig. 5 
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300 

0.1 

0.03 

1ms 

lOOOs 

20/iV^ 

100ms 

50ms 

As-lOs 


Table 3: Parameters used in the different figures for simuiations of networks with noniinear dendrites. The parameter 
a = \s - Xx is given in terms of As and Xx 


O.Ols, 7 = 0 and (Tr, = 0^. We note that for simulations with saturating synapses, we model the 
slow synaptic currents to possess synaptic time constants of 100ms (cf., e.g., j50l[60]). We usually 
use the same value for the slow synapses in networks with nonlinear dendrites. Upon rescaling time, 
these networks can be interpreted as networks with faster time constants, which learn faster target 
dynamics. Since the spike rates scale likewise, we have to consider larger networks to generate rates 
in the biologically plausible range (cf. Fig. F in SI Supporting Information). 


Figure 3 

Figure 3b,c: The PCSN has non-linear dendrites. The target signal is a sine with period 47rs and 
amplitude 2 (normalized to one in the figure). During recall, the neurons of the PCSN spike with 
mean rate 30.2Hz. 

Figure 3d: The PCSN has saturating synapses. The target signal is a saw tooth pattern with period 
2s and amplitude 10 (normalized to one in the figure). We used an Euler scheme here. The mean 
spike rate is 226Hz. 

Figure 3e: The task is performed by a PCSN with non-linear dendrites and by a PCSN with saturating 
synapses. The target signal is sin(t^) -h cos(t|). The mean spike rate is 77.8Hz for saturating 
synapses and 21.3Hz for non-linear dendrites. 

Figure 3f-h: The PCSN has nonlinear dendrites. As teacher we use the standard Lorenz system 

x(t) = a{y{t) - x{t)) 
y{t) = x{t){p - z{t)) - y{t) 
z{t) = x{t)y{t) - I3z{t) 

with parameters a = 10, p = 28, /3 = 8/3; we set the dimensionless temporal unit to 0.2s and scale 
the dynamical variables by a factor of 0.1. Panels (f,g) show a recall phase of 400s, panel (h) shows 
points from a simulation of 4000s. Panel (f) only shows every 10th data point, panel (g) shows every 
data point. The mean spike rate is 432Hz. 
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Figure 4 


Figure 4a-c: We quantified the memory capacity of a CSN with saturating synapses. The network 
has a sparse connectivity matrix A without autapses. We applied white noise with a,, = 0.001^. 
The input is a Gaussian bell curve with a = 0.2s and integral 10s (height normalized to one in the 
figure). The target is a Gaussian bell curve with a = Is and integral Is (height normalized to one 
in the figure). The target is presented several seconds after the input. Trials consisting of inputs 
and subsequent desired outputs are generated at random times with exponential inter-trial-interval 
distribution with time constant 10s and a refractory time of 100s. Training time is Tt = 800s, i.e. the 
network is trained with about 6 to 8 trials. Testing has the same duration with a similar number of 
trials. There is no feedback introduced by initialization or by learning, so the memory effect is purely 
inherent to the random network. We compute the quality of the desired output generation as the 
root mean squared (RMS) error between the generated and the desired response, normalized by the 
number of test trials. As reference, we set the error of the “extinguished” network, which does not 
generate any reaction to the input, to 1. Lower panels of Fig. 4a-c display medians and quartiles 
taken over 50 task repetitions. The sweep was done for time-delays 2 - 20s in steps of 0.5 s. 

Figure 4d: The PCSN has nonlinear dendrites. For this task a constant input of i^onst = b is added 
to the network with the elements of the vector b chosen uniformly from [0^, 250^] to introduce inho¬ 
mogeneity. Four different inputs are fed into the network, two continuous and two pulsed input 
channels The continuous inputs are created by convolving white noise twice with an exponential 
kernel e“*s (equivalent to convolving once with an alpha function) during training and during 
testing. The continuous input signals are normalized to have mean 0 and standard deviation 0.5. 
The pulsed instruction input is created by the convolution of a Poisson spike train with an exponen¬ 
tial kernel The rate of the delta pulses during training is 0.04|. During testing we choose a 
slower rate of 0.011 for a clearer presentation. In the rare case when two pulses overlap such that 
the pulsed signal exceeds an absolute value of 1.01 times the maximal pulse height of one, we shift 
the pulse by the minimal required amount of time to achieve a sum of the pulses below or equal to 
1.01. We use weights 5;*’^ = 100| for the pulsed inputs, = 250| for the continuous inputs and 
wf = 250^ for the feedback; g = 75^. The recurrent weights of the network are trained with respect 
to the memory target Fm{t). This target is +1 if the last instruction pulse came from /f and it is -1 
if the last pulse came from /|. During switching the target follows the integral of the input pulse. 
The corresponding readout is Zm- The second readout Zc is trained to output the absolute value of 
the difference of the two continuous inputs, if the last instruction pulse came from /f, and to output 
their sum, if the last instruction pulse came from /|. The specific analytical form of this target is 
= |/i(t) - /Kt)! -L l)/2 - (/f(t) -L f 2 {t))iFm{t) - l)/2. The mean spike rate is 5.53Hz. 

Figure 5 

Since we have white noise as input we use the Euler-Maruyama scheme in all differential equations. 
The PCSN has nonlinear dendrites. Non-plastic coupling strengths are w-f^y = 100^ for the feedback 
of the y-coordinate, = 100^ for the feedback of the x-coordinate, = 20^ for the feedback 
of the angular velocity and If* = II for the input. We introduce an additional random constant 
bias term into the nonlinearity to increase inhomogeneity between the neurons: The nonlinearity is 
tanh (bj + Wnjmrm{t)) where bj is drawn from a Gaussian distribution with standard deviation 
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0.01. The integration time 5 is 0.1s. During the control/testing phase, every A = 0.01s, M = 200 
samples of length Tr = Is are created, the cost function is weighted with Ac = 0.0l|. The mean spike 
rate is 146Hz. 
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1 Supporting text and figures 


1.1 Networks with nonlinear dendrites 

As stated in the main text, we can generalize Equation (3) by introducing fast connections that gen¬ 
erate discontinuities in postsynaptic neurons when a neuron spikes. We may then require that only a 
lower, say J-, dimensional combination x(i) of the iV-dimensional vectors Y{t) and r{t) is continu¬ 
ous, 

x(t) =LV(t) + fr(i), (SI) 

where L and f are J x N matrices. (For clarity of presentation we will use vector/matrix notation 
instead of components throughout the present section.) The benefit of this approach is that the 
spike trains of a larger population of neurons contribute to each x„, such that a modified analogue to 
Equation (9), 

x(t) Ri rr(t), (S2) 

with a J X N matrix r can hold even if the spike rates of individual neurons are low, i.e. if we make 
use of population/distributed coding. The matrix L is fixed (except for degenerate cases) as soon 
as the matrix f and the fast changes in V are fixed. However, it is not a priori clear how to choose 
the latter two; we need to employ some optimization scheme to ensure both a good approximation 
Equation ) [S2| and a low firing rate. 

For this, we start anew, and in contrast to the previous section with the dynamics for x(i). From these 
we will derive spiking dynamics approximating the x(t). We begin with a general J-dimensional 
nonlinear dynamical system yielding x(i), 

±{t) = i{x{t)) + c{t), (S3) 

where f(x) and c{t) are column vectors of functions fj{xi,...,XN) and external inputs Cj{t), respec¬ 
tively. We will generalize an approach introduced in refs. (H El 13] to nonlinear systems and derive 
spiking dynamics that optimally (see below) approximate x{t) satisfying Equation ( [S3| . The approach 
will yield Equation ( [ST| with a specific L as by-product. We will find that the dynamics of individual 
neurons depend on the fj and we will specify these functions such that the neural dynamics are 
biologically plausible and suitable for universal computation. 

We choose the momentary error or cost function 

E{t) = (x(t) - rr(i))^ -h (S4) 

to be minimized at each time t. The first term in E{t) induces the approximation Equation ) |S^ , the 
second term fosters a low spike rate with spiking distributed over all neurons. The error function 
respects causality as it depends implicitly via x{t) and r(t) on the past and restrains the dynamics at 
the current time t only. Minimizing E{t) at t means that a spike should be sent by neuron n if E{t) 
decreases due to this spike. Comparing En{t) (spike sending at time t by neuron n) with Eo{t) (no 
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spike sending) yields 


(85) 

( 86 ) 

(87) 


En{t) < Eo{t) 

(x(t) - rr(t) - re„)^ + ^ (r(i) + Snf < (x(i) - rr(t))^ + //r^(t) 
r„ • (x(t) - Tr{t)) - firnit) > 

where e„ denotes the n-th unit vector, e„ = (0,1,0, ...)^ (with a 1 in the n-th row), and r„ is the 
n-th column (vector) of the matrix r, r„ = re„. To obtain the familiar condition Vn{t) > On for neuron 
n to spike, the variable left hand side of Equation ) [S7| ) may be interpreted as membrane potential. 


Vn{t) = r„ • (x(t) - Tr{t)) - nrn{t), (88) 

V(i) = r'^(x(t)-rr(t))-//r(t), (89) 


the right hand side as threshold 



(810) 


We note that we can multiply both sides of the Equation by a factor and add constant terms, these 
change the scale of the potential, and shift the resting membrane potential, the reset and the thresh¬ 
old. Equation ) |S9| ) yields Equation with the pseudo-inverse of r^, L = (rr^)“^r, and f = 
r -|- /xL. 


We can now derive the sub-threshold dynamical Equations for V(t) from those for x(i) and r(t): 


V(i) = r^(i(i)-rr(t))-/ir(t) (811) 

= r'^f (x(t)) - (r'^r + /xi) (s(t) - A,r(t)) + r'^c(t), (8i2) 

where 1 denotes the x AT identity matrix. Assuming that the minimization of Equation ) [S^ yields 
small E{t), we may eliminate the dependence on x(t) using Equation ) [S^ , 

v(t) « r^f(rr(i)) - (r^r + /xi) (s(i) - A,r(i)) + r^c{t). (8i3) 


Finally, biological realism and increased stability of numerical simulations indicate that an additional 
leak term -Ay V should be introduced 

V(t) = -AyV(t) + r^f(rr(i)) - (r^r + /xi) {s{t) - A,r(t)) + r^c(t). (8i4) 


We now choose the fj as 


j 

fjixi, ...,xj) = -XxXj E Aji tanh (x,), (815) 

^=1 


such that 

V(t) = -AyV(t) + r^Atanh(rr(t)) - + /xl) s{t) + {aT^T + /xA,l) r(t) + r^c(i), (816) 

where a = Xg- Xx- This yields a spiking neural network that is suitable for universal computation: Its 
dynamics can be decoded via Equation ) [ST| (or Equation ) |S2| )) to resemble those of a J-dimensional 
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dynamical system of the form 


x(t) = —Xx'X-{t) + A tanh (x(i)) + c{t). 


(817) 


Systems of the form Equation are known to be suitable for universal computation (4l[3|6], in 
particular for appropriate A they can maintain longer-term fading memory. Since the x are dynam¬ 
ical quantities linearily derived from the underlying spiking network, already the underlying spiking 
network is suitable for computations. 


Furthermore, the structure of Equation ) |S16| ) allows for a straightforward interpretation in biological 
terms: The response of neuron n’s soma to slow input to its J dendrites is modeled by the term 
r„ • (Atanh(rr(t))). The inputs have non-negligible synaptic time constant (cf. r(t)), they are lin¬ 
early summed and thereafter subjected to a dendritic sublinearity (tanh). The coupling strength of 
a synaptic connection from neuron m to the jth dendrite of neuron n is given by Tjm, the coupling 
strength from the jth dendrite of neuron n to its soma is {T'^A)nj. Further fast and slow inputs ar¬ 
riving near the soma (and thus not subject to a dendritic non-linearity) are incorporated by the terms 
- (r^r -h ^l) s(t) and (ar'^r /iA^l) r{t). Their impact is characterized by the product r^r of the 
decoding matrix with itself and the comparably small weight n of the spike frequency penalty term, 
the positive diagonal terms incorporate the reset of the neurons after a spike and a slower recovery. 


1.2 A sufficient condition for the echo state property of the dynamics Equation (6) 

When does 


±{t) = -Ay [x(t)]_ - Aa; [x(t)] , -k Atanh([x(t)] , ) -h I{t) 


(SI 8) 


(Equation (6) of the main text) possess the echo state property? Dynamics have this property if after 
sufficiently long time any initial conditions are washed out and the state of the system is completely 
determined by the input. This is definitely the case if the distance between trajectories decays at 
least exponentially with a rate independent of the input [7]. We will prove the latter for our dynamics 
Equation For this, we will consider the difference A{t) = xi{t) - X2(t) and the Euclidean 

distance ||A(t)|| of two trajectories that satisfy Equation and have different initial conditions 
xi(0),x2(0) but the same input I{t). We will show that under the condition ||A|| < min(Ay, A^;), with 
||A|| being the spectral norm (the largest singular value) of A, an inequality ||A(i)|| < -e||A(t)|| 
holds for some e > 0 (as usual the dot denotes the temporal derivative of the entire expression below, 
here ||A(t)||). 

We start with the expression 2||A(t)||^ = A(t)A(t) and replace the right hand side using A{t) = 
xi(t) - X 2 (t) and Equation ) |S18| ), which leads to 


^ = (xi-X2)(-Ay([xi(t)]_-[x2(t)]_)-Aa;([xi(i)]_^-[x2(t)]+)) 

-k (xi - X2) (A (tanh([xi(i)]_^) - tanh([x2(i)]_^)) -k I{t) - I{t)) . (SI 9) 


To proceed we use the three inequalities xy < ||x|| ||y| 
||x - y||, which allow to estimate 


lAxll < IIAI 


X 


and ||tanh(x) — tanh(y)|| < 


(xi - X2) A (tanh([xi(t)]_^) - tanh([x2(i)]_^)) < ||xi -X2II ||A|| ||[xi(i)]^ - [x2(i)]_ 


(S20) 
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We now simplify the right hand side of the inequality further. For this we use that for every pair of real 
valued vectors x and y we have [x]± [y]± > 0, [x]^ [y]^ < 0 and [x]^ [x]^ = 0, since every element 
of [x]_^ is larger/equal zero while every element of [x]_ is smaller/equal zero, and elements which are 
nonzero in [x]_,_ are zero in [x]_ and vice versa. With this we get 

llxi-xaf = ||[xi(i)]_^ - [x2(t)]+||^ + ||[xi(t)]_ - [x2(t)]_||^ 

-2 [xi(i)]+ [X2(t)]_ - 2 [X2(t)]+ [xi(i)]_ 

> ||[xi(i)]+- [X2(t)]+|f , (S 21 ) 

since - [xi(t)]_^ [x 2 (i)]_ - [x 2 (i)]+ [xi(t)]_ > 0. The result can be used to bound the right hand side 
of Equation ( |S20} by a simpler expression, 

(xi - X2) A (tanh([xi(t)]_^) - tanh([x2(t)]_,_)) < ||xi - X2||^ || A|| . (S 22 ) 


Now we assume || A|| < min(Ay, A^;) such that we can write Ay = ey + || A|| and A^ = + || A|| with 

ey > 0 and €x > 0. Using this in Equation ( |S19) yields 


^(i)f = -(ev + ||A||) (xi-X2) ([xi(i)]_ - [x2(t)]_) 

- {e-x + II All) (xi - X2) ([xi(i)]_^ - [x2(t)]+) 

+ (xi - X2) A (tanh([xi(t)]_^) - tanh([x2(t)] + )) 

= -ey (xi - X2) ([xi(i)]_ - [x2(t)]_) - e,, (xi - X2) ([xi(t)]+ - [x2(i)]+) 

- II A|| ||xi - X2||^ + (xi - X2) A (tanh([xi(t)]_,_) - tanh([x2(t)]_,_)) , (S 23 ) 


and together with Equation ) |S22) 

1 „ 


< -ey (xi - X2) ([xi(t)]_ - [X2(i)]_) - ex (xi - X2) ([xi(i)]+ - [x2(t)]+) 

- ||A|| ||xi - X2||^ + ||xi - X2||^ ||A|| 

= -ey (xi - X2) ([xi(i)]_ - [X2(i)]_) - e^; (xi - X2) ([xi(t)]+ - [x2(i)]+) . (S24) 


Both terms on the right hand side are smaller or equal to zero, 

(xi - X 2 ) ([xi(t)]^ - [X 2 (i)]i) = ([xi(i)]^ - [X 2 (t)]i)^ - [xi(t)]^ [X2(i)]^ - [X2(t)]± [xi(t)]^ > 0. 


(825) 


We can therefore set e = min(ey, e^) > 0 and simplify 
1 „ 


< -e(xi -X 2 ) ([xi(t)]_ - [x 2 (t)]_) -e(xi - X 2 ) ([xi(t)]_^ - [x 2 (t)]+) 


= -e||A 


(826) 


which is equivalent to 

||A(t)|| < -e||A(t)||. 


(827) 
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Figure A: Comparison of PCSNs and Poisson coding iearning networks. Error of PCSNs and Poisson coding iearning 
networks with different spike rates after teaming continuous dynamics. The panet shows the median error to the saw tooth 
target pattern (of Fig. 3d) during testing, in equidistant bins of the network spike rate (shaded: intervais between first 
and third quartiie). The PCSN with its deterministic spike code reaches the same error ievet as the networks with simpte 
Poisson coding with aimost two orders of magnitude fewer spikes. 

The distance between different trajectories thus decreases at least exponentially fast with rate e, for 
any input. We may conclude that || A|| < min(Ai/, A^;) provides a sufficient condition for the system to 
possess the echo-state property. 

1.3 Comparison of PCSNs with Poisson coding iearning networks 

In the following, we compare the performance of PCSNs and Poisson coding learning networks. To 
enable a direct comparison, we use PCSNs with saturating synapses and Poisson coding networks 
of the same size and with the same learning rule for the recurrent synapses such that in the high-rate 
limit, both network types become equivalent to the same continuous networks. As a specific task 
for the comparison, we choose learning of a saw tooth-like signal as displayed in Fig. 3d. We find 
that both networks perform better for higher rates. However, due to their deterministic, precise spike 
code, the PCSNs achieve the same error levels with almost two orders of magnitude smaller rates, 
cf. Fig. A. This is generally a consequence of the fact that the population coding error in precisely 
spiking networks is much smaller than in Poisson coding networks, ref. [3] shows it to be proportional 
to yN (where N is the number of neurons in the network), while a simple Poisson population code 
has precision ^/Vn. However, in our PCSNs we have additional learning whose consequences on the 
precision of the output signal are not easy to determine. 

The Poisson models are setup as follows: We start with the continuous target dynamics Equation (6) 
and for simplicity consider A^; = Ay and j = 0, i.e. 

N 

Xnit) = -XvXn{t) -k E ^nm tanh([xm(t)]+) (S28) 

m=l 
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The state of the corresponding Poisson unit n shall be characterized by we aim at Un{t) ^ Xn{t) 
for high spike rates. For the spike generation, we orient at standard models (e.g. [6]) and at keeping 
the dynamical Equations simple. 

As Poisson model 1, we use networks of units with threshold and nonlinear saturation, specifically 
unit n has the rate 

I'nit) = sotanh ([u„(i)]_^) . (S29) 

The constant sq allows to modulate the rate without changing the dynamics of u. Given Vn{t), the 
unit generates an inhomogeneous Poisson spike train Sn{t) (cf. Equation (1)) with this rate. The spike 
train in turn generates postsynaptic inputs with decay time constant A^, as given in Equation (2). 
When rescaled with A^/so- the postsynaptic inputs satisfy for large so 

—rn{t) = tanh ([n„(t)]_^) . (S30) 

A network with dynamics 


N 


Un^t) — \yUn(i) + ^ ^ -^nm ^m(^) T 


m=l 


SO 


(831) 


then approximates the continuous dynamics Equation ( |S28) . 
As Poisson model 2, we use networks of linear threshold units. 


t'n(t) = So [Unit)], . 


They yield for large so 


and 


Unit) 


— Tnit) ^ [Unit)] 
So 


N 

-XyUnit) + ^ Anm tanh 

m=l 



+ Ie,n{t) 


(832) 

(833) 


for the network dynamics approximating Equation \S28\ . We note that this model also satisfies a 
decoding Equation analogous to Equation (9), ^Vnit) « [xn(i)]+. 

We change the threshold 6 (PC8Ns) and the base rates so (Poisson networks) to generate networks 
with different rates. For PC8Ns the remaining parameters are adapted such that the corresponding 
continuous network is also given by Equation ( |828) . For each parameter value we train 75 networks 
with different random topology and different initial conditions. We thereafter compute the actually 
generated average spike rates within log-scale equidistant bins. Further, we compute the root mean 
squared (RM8) error between the desired signal and the signal generated during testing. Fig. A 
displays the median and the first and third quartiles of the error versus the average rate in double 
logarithmic scale. 


1.4 Robustness against noise 

PC8N learning is robust against noise. Fig. B shows this by example of the learning of the saw 
tooth pattern (cf. Figs. 3d, A), both for PC8Ns with saturating synapses and nonlinear dendrites. 
In each time step of the Euler-Maruyama integration, Gaussian noise is added. The noise level is 
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Figure B: Robustness of PCSNs against noise. The figure shows the median RMS error between the output of PCSNs 
and the saw tooth target pattern during testing, versus the noise levei (shaded: intervais between first and third quartile). 
The noise ievei is given in terms of the standard deviation generated by a purety noise-driven subthreshold membrane po¬ 
tential (Ornstein-Uhlenbeck process) with membrane time constant \v, in multiples of the threshold. The insets display the 
testing phase for two examples using the setup with nonlinear dendrites (crosses in the main plot denote the corresponding 
noise and error levels). 

given in terms of the standard deviation generated by a purely noise-driven subthreshold membrane 
potential (Ornstein-Uhlenbeck process) with membrane time constant Ay, in multiples of the threshold 
(which equals 9/2 in the case of saturating synapses and 9 in the case of non-linear dendrites), 
i.e. noise-level := The error is determined as RMS error between the desired signal 

and the signal generated during testing. 

1.5 Robustness of PCSNs with nonlinear dendrites against structural perturbations 

In CSNs with nonlinear dendrites, the optimal strengths of the couplings from other neurons to the 
nonlinear dendrites and the fast couplings are independent of the parameters of the encoded nonlin¬ 
ear dynamical system Equation ( |S17| ). Since deviations from the optimal values in these couplings 
in general do not imply a simple change in the encoded dynamics but impair the coding scheme, we 
here investigate robustness of PCSN-learning against them. 

The optimal coupling strength from neuron m to the jth nonlinear dendrite of neuron n is Wnjm = 

Here we test the robustness of the learning scheme against deviations from the optimal couplings. 
We find that PCSN learning is robust, if we modify the neuron model to have a “precise reset”, 
i.e. the reset is always to the fixed value -9 (-29 below threshold 9), even if fast excitation to a 
suprathreshold potential caused the spike (Fig. C). In the native model the reset has fixed size -29 
such that the membrane potential would be reset to a value larger than -9 after a suprathreshold 
excitation. We note that the reset to a fixed value may also be biologically more plausible than a reset 
of fixed size. 
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Figure C: Robustness of PCSNs against structurai perturbations of dendritic coupling. The figure shows perfor¬ 
mance of native PCSNs (networks without precise reset, green) and for PCSNs where neurons are reset to a fixed mem¬ 
brane potential after spike generation (networks with precise reset, blue). Displayed is the median RMS error between 
the output of PCSNs and the saw tooth target pattern during testing versus the standard deviation Wpert of a multiplicative 
Gaussian perturbation in the connectivity (shaded: intervals between first and third quartile). The insets display the testing 
phase for two examples using the setup with precise reset (crosses in the main plot denote the corresponding Wpert and 
error levels). While the native model does not show successful learning for perturbations larger than 2% of the connection 
weights, networks with precise reset generate a recognizable sawtooth pattern as learned output even for perturbations 
of 20% (inset at lower right). The error generated by networks with precise reset increases gradually with perturbation 
strength, while there is an abrupt change for the native networks. 
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Figure D: Robustness of PCSNs against reduction of fast coupiings. The panel shows the median RMS error between 
the output of PCSNs and the saw tooth target pattern during testing versus the size spen of the multiplicative reduction of 
the fast connections (shaded: intervals between first and third quartile). For spert < 10% the error increases only slightly 
with increasing spen- A further increase of spen leads to a strong, rapid increase in the error and the PCSN is soon not able 
to learn the pattern anymore. 
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Fig. C shows this by example of the learning of the saw tooth pattern (of. Figs. 3d, A). The Wnjm are 
perturbed proportionally to the strength of their optimal values, the perturbed couplings are given by 
Wnjm = Tjm ) Where the are independently drawn from a Gaussian distribution 

with mean zero and variance one. We adopt the interpretation of the PCSNs as networks with plastic 
recurrent connections: The outputs, the output weight updates, the inverse correlation matrices and 
the updates of the dendrite-to-soma weights Dnj are computed using Equations (14), (18), (19) and 
(22) with unperturbed readouts fj(t) = tanh Tjmrm{t)^ ■ We note that updating the Dnj is not 

equivalent to a static feedback of the updated overall network readout anymore, since the latter does 
not contain the perturbed Wnjm. 

Our findings raise the question why the networks with precise reset are much more stable to pertur¬ 
bations in the dendrites. We find that the instability in the simulations of the conventional model is 
due to an explosion of the spike rates, such that every neuron spikes once at every simulated time 
step. This is due to the “ping pong effect” already described in ref. [3]: Neurons that spike due to 
excitation from fast connections generate further suprathreshold excitation and in the end all neurons 
have spiked within a single step. In [3J this problem is solved using a higher value of n. We find for 
our simulations with perturbed Wnjm that a fine tuning of /i is required to prevent a breakdown of 
the learning. In contrast, the precise reset solves this problem robustly. This is a consequence of 
the fact that on the one hand the precise reset yields a membrane potential that is further away from 
the threshold and thus reduces the chance of re-excitation of a neuron that has recently spiked (/i 
has an in principle similar effect). On the other hand, the difference from the theory arises only after 
suprathreshold excitation, the change in the precise spiking dynamics is small and the spike coding 
scheme is preserved. 

The optimal strength of a fast coupling from neuron m to neuron n is Unm = ^j=i^jn^jm + fJ^^nm 
(which is independent of N for fixed neuron threshold). In “balanced state” irregular spiking networks, 
such recurrent connections may be expected to be much smaller, since they scale with 1/Vn for 
fixed neuron thresholds [9]. We therefore test the dependence of the PCSNs on these connections 
by a multiplicative weakening by a factor 1 - spert, Unm = (1 - spen)Unm tor n ^ m (resets are 
kept * = Unn). Fig. D shows that PCSNs are robust against this: The learning capabilities are 
conserved, if spert < 10%. The figure also indicates that the fast connections are important for PCSN 
functionality, since the learning abilities are quickly lost as spert increases beyond this range. 

1.6 Comparison of network sizes and the spike rate-network size trade-off 

We illustrate that for PCSNs, network sizes comparable to those of continuous rate networks solving 
the same task with FORCE learning can be sufficient. Further, we show that there is a trade-off 
between network size and spike rate of individual neurons. As example we use the “camel’s hump” 
task (cf. Fig. 3e). Fig. Fa shows that continuous networks Equation (6) can learn the signal well for 
about N > 40, we use iV = 50 as a reference. We compare with PCSNs with nonlinear dendrites 
that encode a system Equation ( 6 ) with the same parameters (in particular J = 50), and are trained 
to solve the same task. We compare the RMS error and spiking frequencies for different PCSN 
sizes N and 7 ^, which regulates the threshold of the neurons (cf. Methods and Equation ) |S10| )). 
Fig. Fb,c shows the trade-off between the number of neurons in the PCSNs and their individual 
spiking frequency: For 7 ^ ss 0.1 - 0.15 only the networks with N = 200 and N = 400 learn the task 
reliably (panel (b)), the neurons adopt a mean spike rate of about or smaller lOOHz. For sufficiently 
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Delay (s) 


Figure E: Memory duration in CSNs with different recurrent coupling strengths. Supporting figure to Fig. 4a-c, 
dispiaying a direct comparison between muitipie error vs. reaction deiay traces. A disconnected network, g\~^ = 0, has 
comparabiy short memory, increase of connection strength ieads to an increase of memory duration (of the trace for 
gX~^ = 0.8j. Memory is most persistent around gX~^ = 1.6, and decreases for iarger coupting strengths, as expected for 
systems where the dynamics become more and more chaotic (gX~^ = 2, gX~^ = 8). 
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Figure F: Network size and PCSN spiking rate trade-off. (a): RMS errors of continuous rate networks of different 
size, after training the carnet’s hump task (cf. Fig. 3e). The networks generate the pattern wet! for network sizes targer 
approximatety N = 40. (b,c): RMS errors of PCSNs with noniinear dendrites, after training the camel’s hump task. The 
panels display the trade-off between network size and spike frequency: For N = 200, a small error can be achieved with a 
single neuron spike rate oflOOHz (cf RMS error at'y^ « 0.1 displayed in (b) and rate af 7 ^ « 0.1 displayed in (c)). Networks 
with N = 400 need only a spike rate of lOHz. Smaller networks (e.g. N = 70) need higher spike rates but reach the same 
error levels. (We note that for small networks we observe a nonmonotonic dependence of the spike rate on the network 
size for constant js-) (d): Example dynamics. The PCSN signal (red trace, 7 ^ = 0.01, N = 70, J = 50J approximates the 
target function (yellow trace) similarly well as a continuous rate neuron network of similar size (blue trace, N = 50). 


small 7 s also smaller networks learn the task well, but all networks generate a higher spike frequency. 


1.7 Error evolution for longer times 

We do not observe lasting changes of the error in long term simulations. For periodic signals, there 
is an inevitable phase shift. It originates from the small error between the period of the desired and 
the learned signal; this error accumulates over time. Apart from that, tested features such as the 
deviation from the desired dynamics (RMS error) and the spike frequency are remarkably constant. 
Fig. Ga,b illustrates and quantifies this for the camel’s hump task (Fig. 3e). Fig. Ga displays the 
continued desired dynamics and the occurring phase shift for longer recall durations. Fig. Gb shows 
that the RMS error is stationary, approximately constant over time, if one corrects for the shift. For 
the Lorenz attractor (Fig. 3f-h), the teacher and student trajectories quickly depart from each other 
after the end of learning. The spiking network nevertheless continues to generate dynamics that 
after decoding agree with the dynamics of a Lorenz system. Fig. Gc shows this for longer times. As a 
quantitative check, the tent map Fig. 3h relating subsequent local maxima in the z-coordinate shows 
the good agreement with the tent-map generated by the teacher dynamics also for long times. In 
the displayed simulation, the Lorenz dynamics deviate three times from the desired dynamics, which 
leads to the six outliers in Fig. 3h. However, the dynamics return every time to the desired dynamics 
such that the errors just generate single outlier pairs in the tent map and the qualitative dynamics 
agree with those of the Lorenz system still after 4000s. 


13 






















time (s) 


Figure G: Long term evolution, (a): PCSN generated camels’s hump signal and target (Figs. 3e, F) after long times: the 
signal is phase-shifted compared to the target but otherwise not noticeably changed. Panel (b) quantifies this observation 
by plotting the RMS error, corrected for a possible phase shift to the target, against time. Displayed are several learning 
trials with different random initial connectivity. The error is stationary, constant except for fluctuations, (c): Lorenz attractor 
signal of Fig. 3f-h, dynamics ofx,y,z i/s. time. Also after long times, the PCSN (red trace) in general generates qualitatively 
the same dynamics as the target Lorenz system (yellow trace). 
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Fig. B 
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0.1ms 
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50ms 

0.90 
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Table A: Parameters used in the different figures for simulations of networks with saturating synapses. 

2 Supporting methods 

2.1 Details on the supporting figures 

The parameters of the different simulations are given in Table 0 for simulations using saturating 
synapses and in Table [^for simulations using nonlinear dendrites. Further parameters and details 
about the figures and simulations are given in the following paragraphs. 

If not mentioned otherwise, for all simulations we use g = 1.5^, p = 0.1, wf = 1^, = l|. At = 

0.01s, 7 = 0 and (Jr, = 0^. 


Figure A 

The signal has period 2s and amplitude 10. The parameters of the Poisson networks are N = 50, 
a = 0.01, g = 1.5^, Tt = 10.5s, = lOOms, Xy^ = Is, dt = O.I/sq, the sparse matrix A has a 

fraction p = 0.1 of nonzero entries, which are drawn from a Gaussian distribution with zero mean and 
variance so is swept between 50^ and 21544^ (values in the different sweeps: so = 50|, 100|, 
150^, 200|, 215|, 464|, 1000|, 2150|, 464l|, 10000|, 21544|). The PCSN has saturating synapses. 
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Fig. D 
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Fig. Fb, c 
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Fig. Fd 
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Fig. Ga, b 
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Table B: Parameters used in the different figures for simuiations of networks with nonlinear dendrites. The parameter 
a = \s - Xx is given in terms of Xs and Xx 

0 is swept between 0.5 and 0.01 (specific values of the sweep: 0 =0.5, 0.4, 0.3, 0.2, 0.1, 0.09, 0.08, 
0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01). We compute the RMS error between the signal and the target 
in the first 10.5s after training. The error is computed using a normalized version of the signal, where 
the amplitude is set to 1. 


Figure B 

The signal has period 2s and amplitude 15 (normalized to one in the figure). We take medians and 
quartiles over 50 trials. The noise level is given in terms of the standard deviation generated by a 
purely noise-driven subthreshold membrane potential (Ornstein-Uhlenbeck process) with membrane 
time constant Ay, in multiples of the threshold (which equals 9/2 in the case of saturating synapses 
and 0 in the case of non-linear dendrites), i.e. noise-level := Plotted are the median of 

the RMS error (shaded: intervals between first and third quartile) between the signal and the target in 
the first 30s after training. The error is computed using the normalized version of the signal, where the 
amplitude is set to 1. The sweep covers 101 equidistant values of an from 0^ to 0.01^ in the case 
of non-linear dendrites and 101 equidistant values of an from 0^ to 0.2^ in the case of saturating 
synapses. 

Figure C 

The signal has period 2s and amplitude 10 (normalized to one in the figure). We take medians and 
quartiles over 50 trials. We use the Euler method to integrate the differential Equations. 

Plotted are the median of the RMS error (shaded: intervals between first and third quartile) between 
the signal and the target in the first 30s after training. The error is computed using the normalized 
version of the signal, where the amplitude is set to 1. The sweep covers 80 equidistant values of ITpert 
from 0 to 0.2. 

For simulations that showed pathological spiking (more than 200 spikes per time step in the numerical 
simulation) we assigned an infinite error. 


Figure D 

The signal has period 2s and amplitude 10 (normalized to one in the figure). We take medians and 
quartiles over 50 trials. We use the Euler method to integrate the differential Equations. 

Plotted are the median of the RMS error (shaded: intervals between first and third quartile) between 
the signal and the target in the first 30s after training. The error is computed using the normalized 
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version of the signal, where the amplitude is set to 1 . The sweep covers 80 equidistant values of spert 
from 0 to 0.3 (displayed is the range up to 0.2). 

For simulations that showed pathological spiking (more than 200 spikes per time step in the numerical 
simulation) we assigned an infinite error. 

Figure E 

The parameters are as in Fig. 4a-c, lower panels, see “Figure details” in the main text. 

Figure F 

Fig. Fa: The continuous networks obey Equation (11) (Equation ( |S17) ), they are endowed with the 
FORCE learning rule. The parameters of the network are A^; = 1 ^, dt = 0 . 01 s, and the sparse matrix 
A has a fraction p = 0.1 of nonzero entries, which are drawn from a Gaussian distribution with zero 
mean and variance ^ with g = 1.5^. The task from Fig. 3e serves as target signal. The learning 
rate is a = 1, the learning time is Tt = 100.5s. The network size N is swept from 10 to 100 in steps of 
10 . The figure shows the median of the RMS error (shaded: intervals between first and third quartile) 
between the signal and the target in the first 10 s after training. The statistics are based on 100 trials 
per value of N. 

Fig. Fb,c: We use PCSNs with nonlinear dendrites, which encode continuous dynamics as gener¬ 
ated by the rate networks in panel (a). 7 ^ is swept over 0.4,0.25,0.15,0.1,0.09,0.075,0.06,0.05,0.04, 
0.025,0.015,0.01 and networks of size N are plotted for N = 70,100,200,400. Fig. Fb shows the 
median of the RMS error (shaded: interval between first and third quartile) between the signal and 
the target in the first 10 s after training. Fig. Fc shows the median of the mean spike rate per neuron 
(shaded: intervals between first and third quartile). The statistics are based on 20 trials per parameter 
combination. 

Fig. Fd: The plot shows example patterns generated by continuous networks as used in panel (a) 
(blue trace) and PCSNs as used in panels (b,c) (red trace). For the continuous network N = 50, for 
the PCSNs 7 s = 0 . 01 , iV = 70, J = 50. The PCSN has a mean spike rate of 441 Hz 

Figure G 

Fig. Ga: The panel shows the last 100 s of the PCSN simulation displayed in Fig. Fd. Total duration 
of the recall phase is 10000 s, Fig. Fd shows the first 100 s. 

Fig. Gb: We use the same network parameters as in Fig. Fd and Fig. Ga, displayed are 10 learning 
trials with different randomly chosen initial connectivity. We compute a phase-shift corrected version 
of the RMS error in a sliding window of size 100 s, for different starting points r of the sliding window. 
T is in the range from Is to 10000 s with step size Is. The phase-shift corrected version of the RMS 
error is computed by jt-hioos ^signal(t) - target(t + A)^ dt with the phase-shift A of the target 
signal chosen such that the integral is minimal. 

Fig. Gc: The parameters are the same as in Fig. 3h. 
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